NOTE.

In August, 1899, I presented a memoir to the Royal Society on the inheritance of coat-colour in the
horse and of eye-colour in man, which was read November, 1899, and ultimately ordered to be published in
the 'Phil. Trans.' Before that memoir was printed, Mr. Yule's valuable memoir on Association was read,
and, further, Mr. Leslie Bramley-Moore showed me that the theory of my memoir as given in § 6 of the
present memoir led to somewhat divergent results according to the methods of proportioning adopted.
We therefore undertook a new investigation of the theory of the whole subject, which is embodied in the
present memoir. The data involved in the paper on coat-colour in horses and eye-colour in man have all
been recalculated, and that paper is nearly ready for presentation. But it seemed best to separate the
purely theoretical considerations from their application to special cases of inheritance, and accordingly the
old memoir now reappears in two sections. The theory discussed in this paper was, further, the basis of a
paper on the Law of Reversion with special reference to the Inheritance of Coat-colour in Basset Hounds
recently communicated to the Society, and about to appear in the 'Proceedings.'

While I am responsible for the general outlines of the present paper, the rough draft of it was
taken up and carried on in leisure moments by Mr. Leslie Bramley-Moore, Mr. L. N. G. Filon, M.A.,
and Miss Alice Lee, D.Sc. Mr. Bramley-Moore discovered the w-functions; Mr. Filon proved most of
their general properties and the convergency of the series; I alone am responsible for sections 4, 5, and 6.
Mr. Leslie Bramley-Moore sent me, without proof, on the eve of his departure for the Cape, the
general expansion for on p. 26. I am responsible for the present proof and its applications. To Dr.
Alice Lee we owe most of the illustrations and the table on p. 17. Thus the work is essentially a
joint memoir in which we have equal part, and the use of the first personal pronoun is due to the fact
that the material had to be put together and thrown into form by one of our number. K. P.
§ (1.) On a General Theorem in Normal Correlation.

Let the frequency surface

\[
z = \frac{N}{2\pi\sqrt{1-r^2}\,\sigma_1\sigma_2}\,
\exp\left\{-\frac{1}{2}\,\frac{1}{1-r^2}\left(\frac{x^2}{\sigma_1^2}+\frac{y^2}{\sigma_2^2}-\frac{2rxy}{\sigma_1\sigma_2}\right)\right\},
\]

where

    N = total number of observations,
    $\sigma_1$, $\sigma_2$ = standard deviations of organs x and y,
    r = correlation of x and y,

be divided into four parts by two planes at right angles to the axes of x and y at
distances $h'$ and $k'$ from the origin. The total volumes or frequencies in these parts
will be represented by a, b, c, and d in the manner indicated in the accompanying
plan:

Table of Frequencies.

    a        b        a + b
    c        d        c + d
    a + c    b + d    N

Then clearly

\[
d = \frac{N}{2\pi\sqrt{1-r^2}} \int_h^\infty\!\!\int_k^\infty
e^{-\frac{1}{2}\,\frac{1}{1-r^2}\,(x^2+y^2-2rxy)}\,dx\,dy,
\tag{i.}
\]

if $h = h'/\sigma_1$ and $k = k'/\sigma_2$.

Further,

\[
b + d = \frac{N}{\sqrt{2\pi}}\int_h^\infty e^{-\frac12 x^2}\,dx,
\tag{ii.}
\]

\[
c + d = \frac{N}{\sqrt{2\pi}}\int_k^\infty e^{-\frac12 y^2}\,dy,
\tag{iii.}
\]

and

\[
\frac{(a+c)-(b+d)}{N} = \sqrt{\frac{2}{\pi}}\int_0^h e^{-\frac12 x^2}\,dx,
\tag{iv.}
\]

\[
\frac{(a+b)-(c+d)}{N} = \sqrt{\frac{2}{\pi}}\int_0^k e^{-\frac12 y^2}\,dy.
\tag{v.}
\]


Thus, when a, b, c, and d are known, h and k can be found by the ordinary table of
the probability integral, say that of Mr. Sheppard ('Phil. Trans.,' A, vol. 192, p. 167,
Table VI.*). The limits accordingly of the integral for d in (i.) are known.
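
In place of a printed table of the probability integral, h and k may be obtained from an inverse normal routine. What follows is a minimal sketch only, assuming Python's standard `statistics.NormalDist`; the function name `class_indices` is illustrative and does not occur in the memoir. It uses (ii.) and (iii.) in the equivalent form that (a + c)/N and (a + b)/N are the areas of the normal curve below h and k.

```python
from statistics import NormalDist


def class_indices(a, b, c, d):
    """Return (h, k) for a fourfold table, in units of the standard deviations.

    By (ii.), (b + d)/N is the area of the normal curve beyond h, so
    (a + c)/N is the area below h; likewise (a + b)/N is the area below k.
    Every marginal total must be positive for the inverse to exist.
    """
    n = a + b + c + d
    std = NormalDist()  # zero mean, unit standard deviation
    h = std.inv_cdf((a + c) / n)
    k = std.inv_cdf((a + b) / n)
    return h, k


h, k = class_indices(60, 20, 10, 10)  # hypothetical frequencies
print(h, k)
```

Both indices are positive here because a + c > b + d and a + b > c + d, that is, the mean falls in the compartment a.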

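Now consider the expression

\[
\frac{1}{\sqrt{1-r^2}}\; e^{-\frac{1}{2}\,\frac{1}{1-r^2}\,(x^2+y^2-2rxy)} = U, \text{ say},
\tag{vi.}
\]

and let us expand it in powers of r. Then, if the expansion be

\[
U = e^{-\frac12(x^2+y^2)}\left(u_0 + \frac{u_1}{1!}\,r + \frac{u_2}{2!}\,r^2 + \dots + \frac{u_n}{n!}\,r^n + \dots\right),
\tag{vii.}
\]

we shall have

\[
u_n = e^{\frac12(x^2+y^2)}\left(\frac{d^n U}{dr^n}\right)_{r=0}.
\tag{viii.}
\]
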



Taking logarithmic differentials, we get at once

\[
(1-r^2)^2\,\frac{dU}{dr} = \{xy + r(1 - x^2 - y^2) + r^2 xy - r^3\}\,U.
\]

Differentiating n times by Leibnitz's theorem, and putting r = 0, we have, after
some reductions,

\[
u_{n+1} = n(2n - 1 - x^2 - y^2)\,u_{n-1} - n(n-1)(n-2)^2\,u_{n-3} + xy\{u_n + n(n-1)\,u_{n-2}\}.
\tag{ix.}
\]

Hence we find

\[
\begin{aligned}
u_0 &= 1,\\
u_1 &= xy,\\
u_2 &= (x^2-1)(y^2-1),\\
u_3 &= x(x^2-3)\,y(y^2-3),\\
u_4 &= (x^4-6x^2+3)(y^4-6y^2+3).
\end{aligned}
\tag{x.}
\]


* See, however, foot-note, p. 5. 
Thus the following laws are indicated:

\[
u_n = v_n \times w_n,
\tag{xi.}
\]

where

\[
v_n = x\,v_{n-1} - (n-1)\,v_{n-2},
\tag{xii.}
\]

\[
w_n = y\,w_{n-1} - (n-1)\,w_{n-2}.
\tag{xiii.}
\]

We shall now show that these laws hold good by induction. Assume

\[
u_m = v_m w_m \quad\text{for every } m \le n.
\]

But by (ix.), substituting for the u's from (xi.),

\[
u_{n+1} = xy\{v_n w_n + n(n-1)\,v_{n-2}w_{n-2}\} + n(2n-1-x^2-y^2)\,v_{n-1}w_{n-1} - n(n-1)(n-2)^2\,v_{n-3}w_{n-3},
\]

and on reducing the right-hand side by repeated use of (xii.) and (xiii.) this becomes

\[
u_{n+1} = (x\,v_n - n\,v_{n-1})(y\,w_n - n\,w_{n-1}) = v_{n+1}w_{n+1}.
\]

Thus, if the theorem holds for $u_n$, it holds for $u_{n+1}$. Accordingly

\[
U = e^{-\frac12(x^2+y^2)}\left(1 + \frac{v_1w_1}{1!}\,r + \frac{v_2w_2}{2!}\,r^2 + \dots + \frac{v_nw_n}{n!}\,r^n + \dots\right),
\tag{xiv.}
\]

where the v's and w's are given by (x.), (xii.), and (xiii.).

It is thus clear that $\frac{1}{2\pi}\int_h^\infty\!\int_k^\infty U\,dx\,dy$ consists of a series of which the general
term is

\[
\frac{1}{n!}\,V_n W_n\, r^n,
\]

where

\[
V_n = \frac{1}{\sqrt{2\pi}}\int_h^\infty e^{-\frac12 x^2}\,v_n\,dx,
\qquad
W_n = \frac{1}{\sqrt{2\pi}}\int_k^\infty e^{-\frac12 y^2}\,w_n\,dy.
\]

It remains to find these integrals.
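
The expansion (xiv.) can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the memoir: it builds the v's from the recurrence $v_n = x v_{n-1} - (n-1)v_{n-2}$ with $v_0 = 1$, $v_1 = x$, sums the series to a fixed number of terms, and compares the result with the closed form (vi.). The names `hermite_v`, `U_closed` and `U_series` are hypothetical.

```python
import math


def hermite_v(n_max, x):
    """Return [v_0, ..., v_n_max] from v_n = x v_{n-1} - (n-1) v_{n-2}."""
    v = [1.0, x]
    for n in range(2, n_max + 1):
        v.append(x * v[n - 1] - (n - 1) * v[n - 2])
    return v[: n_max + 1]


def U_closed(r, x, y):
    """The expression (vi.)."""
    return math.exp(-0.5 * (x * x + y * y - 2 * r * x * y) / (1 - r * r)) / math.sqrt(1 - r * r)


def U_series(r, x, y, terms=60):
    """The series (xiv.), truncated after the power r**terms."""
    v = hermite_v(terms, x)
    w = hermite_v(terms, y)  # the w's obey the same recurrence in y
    total = sum(v[n] * w[n] * r ** n / math.factorial(n) for n in range(terms + 1))
    return math.exp(-0.5 * (x * x + y * y)) * total


print(U_closed(0.3, 0.5, -0.8), U_series(0.3, 0.5, -0.8))
```

For moderate r the two values agree to many places; the unscaled recurrence would overflow for very many terms, so the truncation is kept short here.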
The general form of $v_n$ is given by

\[
v_n = x^n - \frac{n(n-1)}{2\cdot 1!}\,x^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2^2\cdot 2!}\,x^{n-4}
- \frac{n(n-1)(n-2)(n-3)(n-4)(n-5)}{2^3\cdot 3!}\,x^{n-6} + \text{\&c.}
\tag{xv.}
\]

For this obviously gives (x.). Assume it true for $v_{n-1}$ and $v_{n-2}$; then

\[
\begin{aligned}
v_n = x\,v_{n-1} - (n-1)\,v_{n-2}
&= x^n - \frac{(n-1)(n-2)}{2\cdot 1!}\,x^{n-2} + \frac{(n-1)(n-2)(n-3)(n-4)}{2^2\cdot 2!}\,x^{n-4} - \dots\\
&\quad - (n-1)\left\{x^{n-2} - \frac{(n-2)(n-3)}{2\cdot 1!}\,x^{n-4} + \dots\right\}\\
&= x^n - \frac{n(n-1)}{2\cdot 1!}\,x^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2^2\cdot 2!}\,x^{n-4} - \dots
\end{aligned}
\]

Thus the expression (xv.) is shown to hold by induction, the general terms being

\[
(-1)^r\,\frac{(n-1)(n-2)\dots(n-2r+1)}{2^{r-1}\,(r-1)!}\left\{\frac{n-2r}{2r} + 1\right\}x^{n-2r}
= (-1)^r\,\frac{n(n-1)(n-2)\dots(n-2r+1)}{2^r\, r!}\,x^{n-2r},
\]

or the general term in $v_n$.
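We notice at once that

\[
\frac{dv_n}{dx} = n\,v_{n-1}.
\tag{xvi.}
\]

Thus, by (xii.),

\[
v_n = x\,v_{n-1} - \frac{dv_{n-1}}{dx}.
\]

Multiply by $e^{-\frac12 x^2}$ and integrate:

\[
\int v_n\, e^{-\frac12 x^2}\,dx = \int x\,v_{n-1}\,e^{-\frac12 x^2}\,dx - \int e^{-\frac12 x^2}\,\frac{dv_{n-1}}{dx}\,dx.
\]

Integrating the latter integral by parts, we have

\[
\int v_n\, e^{-\frac12 x^2}\,dx = -\,e^{-\frac12 x^2}\,v_{n-1},
\]

or

\[
V_n = \frac{1}{\sqrt{2\pi}}\int_h^\infty v_n\,e^{-\frac12 x^2}\,dx = \frac{1}{\sqrt{2\pi}}\,e^{-\frac12 h^2}\,(v_{n-1})_{x=h}.
\]
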

Now $\frac{1}{\sqrt{2\pi}}\,e^{-\frac12 x^2}$ can be found from any table of the ordinates of the normal curve,
e.g., Mr. Sheppard's, 'Phil. Trans.,' A, vol. 192, p. 153, Table I.* We shall accordingly put

\[
H = \frac{1}{\sqrt{2\pi}}\,e^{-\frac12 h^2}, \qquad K = \frac{1}{\sqrt{2\pi}}\,e^{-\frac12 k^2},
\tag{xvii.}
\]

and look upon H and K as known quantities.

* For our present purposes the differences of Mr. Sheppard's tables are occasionally too large, but the
following series give very close results:

Let
\[
\chi_1 = \sqrt{\frac{\pi}{2}}\,\frac{(a+c)-(b+d)}{N} = \int_0^h e^{-\frac12 x^2}\,dx, \text{ by (iv.)},
\qquad
\chi_2 = \sqrt{\frac{\pi}{2}}\,\frac{(a+b)-(c+d)}{N} = \int_0^k e^{-\frac12 y^2}\,dy, \text{ by (v.)}.
\]

Further, let us write $(v_{n-1})_{x=h}$ as $\bar v_{n-1}$, and similarly $(w_{n-1})_{y=k}$ as $\bar w_{n-1}$. Thus

\[
V_n = H\,\bar v_{n-1}, \qquad W_n = K\,\bar w_{n-1}.
\tag{xviii.}
\]




We have then from (i.)

\[
\frac{d}{N} = \frac{1}{2\pi}\int_h^\infty\!\!\int_k^\infty U\,dx\,dy
= \frac{(b+d)(c+d)}{N^2} + \sum_{n=1}^{\infty}\frac{r^n}{n!}\,HK\,\bar v_{n-1}\,\bar w_{n-1},
\]

by (ii.) and (iii.).

Or, remembering that N = a + b + c + d, we can write this

\[
\begin{aligned}
\frac{ad - bc}{N^2HK} = r &+ \frac{r^2}{2}\,hk + \frac{r^3}{6}\,(h^2-1)(k^2-1) + \frac{r^4}{24}\,h(h^2-3)\,k(k^2-3)\\
&+ \frac{r^5}{120}\,(h^4-6h^2+3)(k^4-6k^2+3)\\
&+ \frac{r^6}{720}\,h(h^4-10h^2+15)\,k(k^4-10k^2+15)\\
&+ \frac{r^7}{5040}\,(h^6-15h^4+45h^2-15)(k^6-15k^4+45k^2-15)\\
&+ \frac{r^8}{40320}\,h(h^6-21h^4+105h^2-105)\,k(k^6-21k^4+105k^2-105) + \text{\&c.}
\end{aligned}
\tag{xix.}
\]


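Then

\[
h = \chi_1 + \frac{1}{3!}\,\chi_1^3 + \frac{7}{5!}\,\chi_1^5 + \frac{127}{7!}\,\chi_1^7 + \dots,
\qquad
H = \frac{1}{\sqrt{2\pi}\left(1 + \frac{1}{2!}\,\chi_1^2 + \frac{7}{4!}\,\chi_1^4 + \frac{127}{6!}\,\chi_1^6 + \dots\right)},
\]

and

\[
k = \chi_2 + \frac{1}{3!}\,\chi_2^3 + \frac{7}{5!}\,\chi_2^5 + \frac{127}{7!}\,\chi_2^7 + \dots,
\qquad
K = \frac{1}{\sqrt{2\pi}\left(1 + \frac{1}{2!}\,\chi_2^2 + \frac{7}{4!}\,\chi_2^4 + \frac{127}{6!}\,\chi_2^6 + \dots\right)}.
\]

These follow from the considerations that if

\[
\chi_1 = \int_0^h e^{-\frac12 x^2}\,dx, \qquad \frac{d\chi_1}{dh} = \sqrt{2\pi}\,H, \qquad \frac{dH}{dh} = -hH,
\]

\[
\chi_2 = \int_0^k e^{-\frac12 y^2}\,dy, \qquad \frac{d\chi_2}{dk} = \sqrt{2\pi}\,K, \qquad \frac{dK}{dk} = -kK,
\]

whence it is easy to find the successive differentials of h with regard to $\chi_1$ and k with regard to $\chi_2$, and
then obtain the above results by Maclaurin's theorem. There is, of course, no difficulty in calculating
H and K from (xvii.) directly. That method was adopted in the numerical illustrations.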
Here the left-hand side is known, and since h and k are known, we can find the
coefficients of any number of powers of r so soon as the first two have been found,
from (xii.) and (xiii.).

Accordingly the correlation can be found if we have only made a grouping of our
frequencies into the four divisions, a, b, c, and d.
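
The determination of r from (xix.) may be sketched as follows. This is an illustration only, under several assumptions that are not in the memoir: the names `series_xix` and `tetrachoric_r` are hypothetical, the root is located by simple bisection rather than by the successive approximations described in the text, the search is confined to |r| < 0.99 where the truncated series is trustworthy, and the coefficients are carried in the scaled form $p_m = \bar v_m/\sqrt{m!}$ so that the n-th term of the series is $r^n\,p_{n-1}(h)\,p_{n-1}(k)/n$ and no large factorials arise.

```python
import math
from statistics import NormalDist


def series_xix(r, h, k, terms=2000):
    """Right-hand side of (xix.): sum over n >= 1 of r^n/n! * v_{n-1}(h) * w_{n-1}(k)."""
    ph_prev, ph = 0.0, 1.0  # p_{-1} and p_0 at x = h
    pk_prev, pk = 0.0, 1.0  # p_{-1} and p_0 at y = k
    total = 0.0
    for n in range(1, terms + 1):
        total += r ** n / n * ph * pk
        # scaled form of (xii.): p_n = (x p_{n-1} - sqrt(n-1) p_{n-2}) / sqrt(n)
        ph, ph_prev = (h * ph - math.sqrt(n - 1) * ph_prev) / math.sqrt(n), ph
        pk, pk_prev = (k * pk - math.sqrt(n - 1) * pk_prev) / math.sqrt(n), pk
    return total


def tetrachoric_r(a, b, c, d, tol=1e-10):
    """Solve (xix.) for r by bisection (an assumed, not the memoir's, procedure)."""
    n = a + b + c + d
    std = NormalDist()
    h = std.inv_cdf((a + c) / n)
    k = std.inv_cdf((a + b) / n)
    left = (a * d - b * c) / (n * n * std.pdf(h) * std.pdf(k))  # (ad - bc)/(N^2 H K)
    lo, hi = -0.99, 0.99
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # the series increases with r, its derivative being a positive ordinate
        if series_xix(mid, h, k) < left:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(tetrachoric_r(40, 10, 10, 40))  # hypothetical frequencies, divisions at the medians
```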

If h and k be zero, we have from (xvii.) and (iv.)

\[
H = K = \frac{1}{\sqrt{2\pi}}, \qquad a = d, \quad b = c.
\]

The right-hand side of (xix.) is now

\[
r + \frac{1}{6}\,r^3 + \frac{3}{40}\,r^5 + \dots,
\]

or equal to $\sin^{-1} r$.

Hence

\[
r = \sin 2\pi\,\frac{ad - bc}{N^2} = \cos \pi\,\frac{b}{a+b},
\tag{xx.}
\]

which agrees with a result of Mr. Sheppard's, 'Phil. Trans.,' A, vol. 192, p. 141. We
have accordingly reached a generalised form of his result for any class-index whatever.
Clearly, also, r being known, we can at once calculate the frequency of pairs of organs
with deviations as great as or greater than h and k.

§ (2.) Other Series for the Determination of r.

For many purposes the series (xix.) is sufficiently convergent to give r for given
h and k with but few approximations, but we will now turn to other developments.
We have by (vii.)

\[
\int_0^r U\,dr = e^{-\frac12(x^2+y^2)}\left(u_0\,r + \frac{u_1}{2!}\,r^2 + \dots + \frac{u_n}{(n+1)!}\,r^{n+1} + \dots\right).
\]

Put x = h, y = k, and write for brevity

\[
\epsilon = \frac{ad - bc}{N^2HK}.
\]

It follows at once from (xix.) that

\[
\epsilon = e^{\frac12(h^2+k^2)}\int_0^r \frac{1}{\sqrt{1-r^2}}\; e^{-\frac12\,\frac{h^2+k^2-2rhk}{1-r^2}}\,dr
\tag{xxi.}
\]

\[
\phantom{\epsilon} = e^{\frac12 k^2}\int_0^\theta e^{-\frac12(h\tan\theta - k\sec\theta)^2}\,d\theta
= e^{\frac12 h^2}\int_0^\theta e^{-\frac12(k\tan\theta - h\sec\theta)^2}\,d\theta,
\tag{xxii.}
\]

if $r = \sin\theta$.

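Now either of the quantities under the sign of integration in (xxii.) can be expanded
in powers of $\theta$ by Maclaurin's theorem. Thus let

\[
\chi = e^{-\frac12(h\tan\theta - k\sec\theta)^2}
= \chi_0 + \left(\frac{d\chi}{d\theta}\right)_0 \theta + \left(\frac{d^2\chi}{d\theta^2}\right)_0 \frac{\theta^2}{2!} + \dots + \left(\frac{d^n\chi}{d\theta^n}\right)_0 \frac{\theta^n}{n!} + \dots;
\]

then

\[
\epsilon\, e^{-\frac12 k^2} = \chi_0\,\theta + \left(\frac{d\chi}{d\theta}\right)_0 \frac{\theta^2}{2!} + \dots + \left(\frac{d^n\chi}{d\theta^n}\right)_0 \frac{\theta^{n+1}}{(n+1)!} + \dots,
\]

and it remains to find $\left(\dfrac{d^n\chi}{d\theta^n}\right)_0$.

Now

\[
\log\chi = -\tfrac12\,(h\tan\theta - k\sec\theta)^2.
\]

Hence

\[
\tfrac14(\cos 3\theta + 3\cos\theta)\,\frac{d\chi}{d\theta}
= -\,\chi\left[(h^2+k^2)\sin\theta - hk\left(\tfrac32 - \tfrac12\cos 2\theta\right)\right].
\]

Differentiating n - 1 times by Leibnitz's theorem, and putting $\theta = 0$, and writing $\chi_0^{(m)}$ for
$\left(d^m\chi/d\theta^m\right)_0$, we have

\[
\begin{aligned}
\chi_0^{(n)} = hk\,\chi_0^{(n-1)}
&- \sum_{m\ge 1}(-1)^m\binom{n-1}{2m}\left\{\frac{3+3^{2m}}{4}\,\chi_0^{(n-2m)} + 2^{2m-1}\,hk\,\chi_0^{(n-2m-1)}\right\}\\
&- (h^2+k^2)\sum_{m\ge 0}(-1)^m\binom{n-1}{2m+1}\,\chi_0^{(n-2m-2)},
\end{aligned}
\tag{xxiii.}
\]

the sums extending over those terms only in which the order of differentiation of $\chi$ is not negative.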

Clearly $\chi_0 = e^{-\frac12 k^2}$; then we rapidly find

\[
\left(\frac{d\chi}{d\theta}\right)_0 = hk\,e^{-\frac12 k^2},
\]

\[
\left(\frac{d^2\chi}{d\theta^2}\right)_0 = -(h^2 + k^2 - h^2k^2)\,e^{-\frac12 k^2},
\]

\[
\left(\frac{d^3\chi}{d\theta^3}\right)_0 = hk\,\{h^2k^2 - 3(h^2+k^2) + 5\}\,e^{-\frac12 k^2}.
\]

Or, finally

\[
\epsilon = \theta + \tfrac12\,hk\,\theta^2 - (h^2 + k^2 - h^2k^2)\,\frac{\theta^3}{3!} + hk\,\{h^2k^2 - 3(h^2+k^2) + 5\}\,\frac{\theta^4}{4!} + \dots
\tag{xxiv.}
\]

where more terms if required can be found by (xxiii.). If $\theta$ be fairly small, $\theta^5$ will be
negligible. Or if h and k be small, the lowest term in the next factor will be $h^2 + k^2$,
and this into $\theta^5/5!$ is generally quite insensible. Very often two or three terms on the
right-hand side of (xxiv.) give quite close enough values of $\theta$, and accordingly of
$r = \sin\theta$. (xxiv.) is clearly somewhat more convergent than (xix.) if h and k are, as
usually happens, less than unity.
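
The closeness of the four-term form of (xxiv.) can be examined numerically. The sketch below is an illustration only: it compares the truncated series with a direct evaluation of the integral $e^{\frac12 k^2}\int_0^\theta e^{-\frac12(h\tan t - k\sec t)^2}\,dt$ by Simpson's rule. The function names and the choice of Simpson's rule are assumptions made for the example, not the memoir's procedure.

```python
import math


def eps_theta_series(theta, h, k):
    """First four terms of (xxiv.)."""
    return (theta
            + 0.5 * h * k * theta ** 2
            - (h * h + k * k - h * h * k * k) * theta ** 3 / 6
            + h * k * (h * h * k * k - 3 * (h * h + k * k) + 5) * theta ** 4 / 24)


def eps_integral(theta, h, k, steps=2000):
    """Integral form of epsilon, evaluated by Simpson's rule (steps must be even)."""
    def integrand(t):
        return math.exp(-0.5 * (h * math.tan(t) - k / math.cos(t)) ** 2)

    width = theta / steps
    total = integrand(0.0) + integrand(theta)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * width)
    return math.exp(0.5 * k * k) * total * width / 3


print(eps_theta_series(0.3, 0.4, 0.2), eps_integral(0.3, 0.4, 0.2))
```

With these hypothetical values of h and k the two results differ only in the fifth decimal place, in agreement with the remark that the term in $\theta^5$ is generally insensible.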

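Returning now to (xix.), let us write it

\[
\epsilon = f(r, h, k).
\]

This is the equation that must be solved for r. Suppose $r_0$ a root of this when we
retain only few terms on the right, say a root of the quadratic

\[
\epsilon = r + \tfrac12\,hk\,r^2.
\]

Then if $r = r_0 + \rho$,

\[
\epsilon = f(r_0, h, k) + \rho f'(r_0, h, k) + \tfrac12\rho^2 f''(r_0, h, k) + \text{\&c.}
\]

Hence to a third approximation

\[
\rho = \frac{\epsilon - f(r_0, h, k)}{f'(r_0, h, k)}
= \{\epsilon - f(r_0, h, k)\}\,\sqrt{1-r_0^2}\;\, e^{-\frac{2r_0hk - r_0^2(h^2+k^2)}{2(1-r_0^2)}}, \text{ nearly},
\tag{xxv.}
\]

which gives us a value of $\rho$ which, substituted in $\rho^2$ in the above equation, introduces
only terms of the 6th order in $r_0$.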
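
The correction of an approximate root may be sketched thus. This is a hedged illustration, not the memoir's own computation: it starts from the root of the quadratic $\epsilon = r + \frac12 hk\,r^2$ and repeats the correction $\rho = \{\epsilon - f(r_0)\}/f'(r_0)$, taking $f'$ from the integral form of $\epsilon$. The names `f_series`, `f_prime` and `refine_r`, the number of repetitions, and the restriction to |r| <= 0.95 are assumptions of the example.

```python
import math


def f_series(r, h, k, terms=400):
    """f(r, h, k): the right-hand side of (xix.), using scaled coefficients."""
    ph_prev, ph, pk_prev, pk = 0.0, 1.0, 0.0, 1.0
    total = 0.0
    for n in range(1, terms + 1):
        total += r ** n / n * ph * pk
        ph, ph_prev = (h * ph - math.sqrt(n - 1) * ph_prev) / math.sqrt(n), ph
        pk, pk_prev = (k * pk - math.sqrt(n - 1) * pk_prev) / math.sqrt(n), pk
    return total


def f_prime(r, h, k):
    """df/dr, i.e. exp((h^2 + k^2)/2) times the expression U at x = h, y = k."""
    return math.exp((2 * r * h * k - r * r * (h * h + k * k)) / (2 * (1 - r * r))) / math.sqrt(1 - r * r)


def refine_r(eps, h, k, steps=6):
    """Start from the quadratic's root and apply the correction repeatedly."""
    hk = h * k
    disc = 1.0 + 2.0 * hk * eps
    if abs(hk) > 1e-12 and disc >= 0.0:
        r = (math.sqrt(disc) - 1.0) / hk  # the root that tends to eps as hk -> 0
    else:
        r = eps
    for _ in range(steps):
        r = max(-0.95, min(0.95, r))  # keep the truncated series trustworthy
        r += (eps - f_series(r, h, k)) / f_prime(r, h, k)
    return r


eps = f_series(0.5, 0.3, -0.2)  # hypothetical r, h, k
print(refine_r(eps, 0.3, -0.2))
```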

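Another integral expression for $\epsilon$ of Equation (xxi.) may here be noticed:

\[
h^2 + k^2 - 2rhk = \tfrac12(1+r)(h-k)^2 + \tfrac12(1-r)(h+k)^2.
\]

Hence

\[
\epsilon = e^{\frac12(h^2+k^2)}\int_0^r \frac{1}{\sqrt{1-r^2}}\;
e^{-\frac14\left\{\frac{(h-k)^2}{1-r} + \frac{(h+k)^2}{1+r}\right\}}\,dr.
\]

Let $\tan^2\phi = \dfrac{1-r}{1+r}$, or, $r = \cos 2\phi$.

Therefore

\[
\epsilon = 2e^{\frac14(h^2+k^2)}\int_\phi^{45^\circ} e^{-\frac18\{(h+k)^2\tan^2\phi + (h-k)^2\cot^2\phi\}}\,d\phi
= 2e^{\frac14(h^2+k^2)}\int_1^{\cot\phi} e^{-\frac18\left\{\frac{(h+k)^2}{v^2} + (h-k)^2 v^2\right\}}\,\frac{dv}{1+v^2},
\]

where $v = \cot\phi$ and is > 1.

It seems possible that interesting developments for $\epsilon$ might be deduced from this
integral expression.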
§ (3.) To show that the Series for r is Convergent if r < 1, whatever be the Values
of h and k.

Write the series in the form of p. 6, i.e.:

\[
\epsilon = \sum_{n=1}^{\infty}\frac{r^n}{n!}\,\bar v_{n-1}\,\bar w_{n-1}.
\]

Now

\[
\bar v_n = h\,\bar v_{n-1} - (n-1)\,\bar v_{n-2}, \qquad \bar w_n = k\,\bar w_{n-1} - (n-1)\,\bar w_{n-2},
\]

by (xii.) and (xiii.). From these we deduce

\[
\bar v_{n+1} = \{h^2 - (2n-1)\}\,\bar v_{n-1} - (n-1)(n-2)\,\bar v_{n-3},
\]

\[
\bar w_{n+1} = \{k^2 - (2n-1)\}\,\bar w_{n-1} - (n-1)(n-2)\,\bar w_{n-3}.
\]

Now let $s_n = \bar v_{n-1}\,r^{\frac12 n}/(n!)^{\frac12}$, $t_n = \bar w_{n-1}\,r^{\frac12 n}/(n!)^{\frac12}$.

Then we find

\[
s_{n+2} = \frac{h^2 - (2n-1)}{\sqrt{(n+1)(n+2)}}\;r\,s_n - \sqrt{\frac{(n-1)(n-2)^2}{n(n+1)(n+2)}}\;r^2\,s_{n-2},
\]

\[
t_{n+2} = \frac{k^2 - (2n-1)}{\sqrt{(n+1)(n+2)}}\;r\,t_n - \sqrt{\frac{(n-1)(n-2)^2}{n(n+1)(n+2)}}\;r^2\,t_{n-2}.
\]

Thus, when n is large, we find the ratio of successive terms $s_{n+2}/s_n$ or $t_{n+2}/t_n$ is given
by $\rho$, where

\[
\rho = -2r - r^2/\rho, \quad\text{or,}\quad \rho = -r.
\]

The ultimate ratio of $s_{n+2}t_{n+2}$ to $s_n t_n$ is accordingly given by $r^2$, but this is the
ratio of alternate terms of the original series. The original series thus breaks up
into two series, one of odd and one of even powers of r. Both these series are
absolutely convergent whatever h and k be, having an ultimate convergence ratio of $r^2$.

§ (4.) To find the Probable Error of the Correlation Coefficient as Determined by the
Method of this Memoir.

Given a division of the total frequency N into a, b, c, d groups, where
a + b + c + d = N, then the probable error of any one of them, say a, is ·67449 $\sigma_a$,
where*

\[
\sigma_a = \sqrt{\frac{a(N-a)}{N}}.
\tag{xxvi.}
\]

Let $b + d = n_1$, $c + d = n_2$; then

\[
\sigma_{n_1} = \sqrt{\frac{n_1(N-n_1)}{N}}, \qquad \sigma_{n_2} = \sqrt{\frac{n_2(N-n_2)}{N}}.
\tag{xxvii.}
\]

* The standard deviation of an event which happens np times and fails nq times in n trials is well
known to be $\sqrt{npq}$. The probable errors here dealt with are throughout, of course, those arising from
different samples of the same general population.




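To obtain $r_{cd}$ we have, if $\delta\eta$ denotes an error in any quantity $\eta$,

\[
\delta c + \delta d = \delta n_2;
\]

\[
\therefore\quad \sigma_c^2 + \sigma_d^2 + 2\sigma_c\sigma_d\,r_{cd} = \sigma_{n_2}^2,
\]

by squaring, summing for all possible variations in c and d, and dividing by the total
number of variations.

Hence, substituting the values of the standard deviations as found above, we
deduce

\[
\sigma_c\sigma_d\,r_{cd} = -\,\frac{cd}{N}.
\]

In a similar manner

\[
\delta n_1\,\delta d = \delta b\,\delta d + (\delta d)^2,
\]

and

\[
\sigma_d\sigma_{n_1}r_{dn_1} = \sigma_b\sigma_d\,r_{bd} + \sigma_d^2 = d\,(a+c)/N,
\]

\[
\sigma_d\sigma_{n_2}r_{dn_2} = \sigma_c\sigma_d\,r_{cd} + \sigma_d^2 = d\,(a+b)/N.
\]

Again, by (ii.),

\[
n_1 = \frac{N}{\sqrt{2\pi}}\int_h^\infty e^{-\frac12 x^2}\,dx,
\qquad\text{whence}\qquad
\delta n_1 = -\,\frac{N}{\sqrt{2\pi}}\,e^{-\frac12 h^2}\,\delta h = -NH\,\delta h.
\]

Thus

\[
\sigma_{n_1} = NH\,\sigma_h,
\]

and similarly

\[
\sigma_{n_2} = NK\,\sigma_k.
\]

Hence the probable error of h

\[
= \frac{\text{·67449}}{H}\sqrt{\frac{(b+d)(a+c)}{N^3}},
\tag{xxxiv.}
\]

and of k

\[
= \frac{\text{·67449}}{K}\sqrt{\frac{(c+d)(a+b)}{N^3}}.
\tag{xxxv.}
\]

They can be found at once, therefore, when H and K have been found from an
ordinate table of the exponential curve, and a, b, c, d are given. We have thus the
probable error of the means as found from any double grouping of observations.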
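
Formulæ (xxxiv.) and (xxxv.) lend themselves to direct computation. The following is a minimal sketch, assuming Python's `statistics.NormalDist` in place of the ordinate table; the function name `probable_errors_hk` is hypothetical.

```python
import math
from statistics import NormalDist


def probable_errors_hk(a, b, c, d):
    """Probable errors of h and k by (xxxiv.) and (xxxv.)."""
    n = a + b + c + d
    std = NormalDist()
    h = std.inv_cdf((a + c) / n)
    k = std.inv_cdf((a + b) / n)
    big_h = std.pdf(h)  # H of (xvii.)
    big_k = std.pdf(k)  # K of (xvii.)
    pe_h = 0.67449 / big_h * math.sqrt((b + d) * (a + c) / n ** 3)
    pe_k = 0.67449 / big_k * math.sqrt((c + d) * (a + b) / n ** 3)
    return pe_h, pe_k


print(probable_errors_hk(60, 20, 10, 10))  # hypothetical frequencies
```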

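Next, noting that

\[
\delta n_1 = -NH\,\delta h, \qquad \delta n_2 = -NK\,\delta k,
\]

we have

\[
\sigma_{n_1}\sigma_{n_2}\,r_{n_1n_2} = N^2HK\,\sigma_h\sigma_k\,r_{hk},
\]

or

\[
r_{hk} = r_{n_1n_2}.
\]
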

12 PROFESSOR K. PEARSON ON MATHEMATICAL CONTRIBUTIONS

But $\delta n_1\,\delta n_2 = (\delta b + \delta d)(\delta c + \delta d)$, whence

\[
\sigma_{n_1}\sigma_{n_2}\,r_{n_1n_2} = \frac{ad - bc}{N},
\tag{xxxvi.}
\]

therefore

\[
\sigma_h\sigma_k\,r_{hk} = \frac{ad - bc}{N^3HK},
\tag{xxxvii.}
\]

\[
r_{hk} = \frac{ad - bc}{\sqrt{(b+d)(a+c)(c+d)(a+b)}}.
\tag{xxxviii.}
\]

This is an important result; it expresses the correlation between errors in the
position of the means of the two characters under consideration. But if the
probabilities were independent there could be no such correlation. Thus $r_{hk}$ might be
taken as a measure of divergence from independent variation. We shall return to
this point later.

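Since $\delta n_1 = -HN\,\delta h$, we have $\delta n_1\,\delta d = -HN\,\delta d\,\delta h$, whence we easily deduce

\[
r_{dh} = -\,r_{dn_1};
\tag{xxxix.}
\]

similarly

\[
r_{dk} = -\,r_{dn_2}.
\tag{xl.}
\]

Now d is a function of r, h, and k. Hence if d = f(r, h, k),

\[
\delta d = \frac{df}{dr}\,\delta r + \frac{df}{dh}\,\delta h + \frac{df}{dk}\,\delta k
= \gamma_0\,\delta r + \gamma_1\,\delta h + \gamma_2\,\delta k.
\tag{xli.}
\]

Whence transposing, squaring, summing, and dividing by the total number of
observations, we find

\[
\begin{aligned}
\gamma_0^2\sigma_r^2 = \sigma_d^2 &+ \left(\frac{\gamma_1}{HN}\right)^2\sigma_{n_1}^2 + \left(\frac{\gamma_2}{KN}\right)^2\sigma_{n_2}^2
+ 2\,\frac{\gamma_1}{HN}\,\sigma_d\sigma_{n_1}r_{dn_1}\\
&+ 2\,\frac{\gamma_2}{KN}\,\sigma_d\sigma_{n_2}r_{dn_2} + 2\,\frac{\gamma_1\gamma_2}{N^2HK}\,\sigma_{n_1}\sigma_{n_2}r_{n_1n_2}.
\end{aligned}
\tag{xlii.}
\]

Substituting the values of the standard deviations and correlations as found above,
we have

\[
\begin{aligned}
\sigma_r^2 = \frac{1}{N\gamma_0^2}\Big\{\, d(a+b+c) &+ \left(\frac{\gamma_1}{HN}\right)^2 (b+d)(a+c) + \left(\frac{\gamma_2}{KN}\right)^2 (c+d)(a+b)\\
&+ 2\,\frac{\gamma_1\gamma_2}{N^2HK}\,(ad - bc) + 2\,\frac{\gamma_1}{HN}\,d(a+c) + 2\,\frac{\gamma_2}{KN}\,d(a+b)\Big\}.
\end{aligned}
\tag{xliii.}
\]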
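It remains now to determine $\gamma_0$, $\gamma_1$, and $\gamma_2$.
By Equation (i.)

\[
\begin{aligned}
\gamma_1 = \frac{dd}{dh} &= -\,\frac{N}{2\pi\sqrt{1-r^2}}\int_k^\infty e^{-\frac12\,\frac{1}{1-r^2}\,(h^2+y^2-2rhy)}\,dy\\
&= -\,\frac{N}{2\pi\sqrt{1-r^2}}\;e^{-\frac12 h^2}\int_k^\infty e^{-\frac12\,\frac{(y-rh)^2}{1-r^2}}\,dy\\
&= -\,HN\,\frac{1}{\sqrt{2\pi}}\int_{\beta_2}^\infty e^{-\frac12 z^2}\,dz,
\end{aligned}
\tag{xliv.}
\]

where $\beta_2 = (k - rh)/\sqrt{1-r^2}$. Thus

\[
\frac{\gamma_1}{NH} = -\,\frac{1}{\sqrt{2\pi}}\int_{\beta_2}^\infty e^{-\frac12 z^2}\,dz
= \frac{1}{\sqrt{2\pi}}\int_0^{\beta_2} e^{-\frac12 z^2}\,dz - \frac12 = \psi_2 - \frac12.
\tag{xlv.}
\]

Similarly

\[
\frac{\gamma_2}{NK} = \psi_1 - \frac12,
\tag{xlvi.}
\]

where

\[
\psi_1 = \frac{1}{\sqrt{2\pi}}\int_0^{\beta_1} e^{-\frac12 z^2}\,dz, \qquad
\psi_2 = \frac{1}{\sqrt{2\pi}}\int_0^{\beta_2} e^{-\frac12 z^2}\,dz,
\tag{xlvii.}
\]

\[
\beta_1 = \frac{h - rk}{\sqrt{1-r^2}}, \qquad \beta_2 = \frac{k - rh}{\sqrt{1-r^2}},
\tag{xlviii.}
\]

and thus $\psi_1$ and $\psi_2$ can be found at once from the tables when $\beta_1$ and $\beta_2$ are found
from the known values of r, h, k.

Lastly, we have from Equation (xxi.)

\[
\frac{ad - bc}{N^2HK} = e^{\frac12(h^2+k^2)}\int_0^r U\,dr,
\]

or*

\[
\frac{d}{N} = \frac{(b+d)(c+d)}{N^2} + \frac{1}{2\pi}\int_0^r \frac{1}{\sqrt{1-r^2}}\;e^{-\frac12\,\frac{h^2+k^2-2rhk}{1-r^2}}\,dr.
\]

* By Equations (ii.) and (iii.), d + b and d + c are independent of r.

Thus

\[
\gamma_0 = \frac{dd}{dr} = \frac{N}{2\pi}\,U_{x=h,\,y=k} = N\chi_0,
\]

where

\[
\chi_0 = \frac{1}{2\pi\sqrt{1-r^2}}\;e^{-\frac12\,\frac{1}{1-r^2}\,(h^2+k^2-2rhk)},
\tag{xlix.}
\]

a value which can again be found as soon as r, h, k are known. $\gamma_0 = \chi_0 N$ is clearly
the ordinate of the frequency surface corresponding to x = h, y = k.

Substituting in Equation (xliii.) we have, after some reductions,

\[
\begin{aligned}
\text{Probable error of } r = \text{·67449}\,\sigma_r
= \frac{\text{·67449}}{\sqrt{N}\,\chi_0}\Big\{ &\frac{(a+d)(c+b)}{4N^2} + \psi_2^2\,\frac{(a+c)(d+b)}{N^2} + \psi_1^2\,\frac{(a+b)(d+c)}{N^2}\\
&+ 2\psi_1\psi_2\,\frac{ad - bc}{N^2} - \psi_2\,\frac{ab - cd}{N^2} - \psi_1\,\frac{ac - bd}{N^2}\Big\}^{\frac12},
\end{aligned}
\tag{l.}
\]

where $\chi_0$, $\psi_1$, and $\psi_2$ are readily found from Equations (xlix.), (xlvii.), and (xlviii.).
Thus the probable error of r can be fairly readily found. It must be noted in using this
formula, that a is the quadrant in which the mean falls, so that h and k are both
positive (see fig., p. 2). In other words, we have supposed a + c > b + d and
a + b > c + d. Our lettering must always be arranged so as to suit this result
before we apply the above formula.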
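
The reduction from (xliii.) to the final formula can be verified numerically. The sketch below is an illustration under stated assumptions: it evaluates the probable error of r both from the reduced expression in $\chi_0$, $\psi_1$, $\psi_2$ and from the unreduced form (xliii.) with $\gamma_1/(NH) = \psi_2 - \frac12$ and $\gamma_2/(NK) = \psi_1 - \frac12$, so that the two may be compared. The function names are hypothetical, and the value of r supplied in the usage line is merely assumed for the example rather than computed from the table.

```python
import math
from statistics import NormalDist


def _psi_chi(r, h, k):
    """psi_1, psi_2 of (xlvii.) and chi_0 of (xlix.)."""
    s = math.sqrt(1.0 - r * r)
    std = NormalDist()
    psi1 = std.cdf((h - r * k) / s) - 0.5  # integral from 0 to beta_1
    psi2 = std.cdf((k - r * h) / s) - 0.5  # integral from 0 to beta_2
    chi0 = math.exp(-0.5 * (h * h + k * k - 2 * r * h * k) / (1 - r * r)) / (2 * math.pi * s)
    return psi1, psi2, chi0


def probable_error_r(a, b, c, d, r, h, k):
    """Reduced formula for the probable error of r."""
    n = a + b + c + d
    psi1, psi2, chi0 = _psi_chi(r, h, k)
    bracket = ((a + d) * (b + c) / 4.0
               + psi2 ** 2 * (a + c) * (b + d)
               + psi1 ** 2 * (a + b) * (c + d)
               + 2 * psi1 * psi2 * (a * d - b * c)
               - psi2 * (a * b - c * d)
               - psi1 * (a * c - b * d)) / n ** 2
    return 0.67449 * math.sqrt(bracket) / (math.sqrt(n) * chi0)


def probable_error_r_unreduced(a, b, c, d, r, h, k):
    """The same quantity from (xliii.), before the reductions."""
    n = a + b + c + d
    psi1, psi2, chi0 = _psi_chi(r, h, k)
    g1 = psi2 - 0.5  # gamma_1 / (N H)
    g2 = psi1 - 0.5  # gamma_2 / (N K)
    total = (d * (a + b + c)
             + g1 ** 2 * (b + d) * (a + c)
             + g2 ** 2 * (c + d) * (a + b)
             + 2 * g1 * g2 * (a * d - b * c)
             + 2 * g1 * d * (a + c)
             + 2 * g2 * d * (a + b))
    gamma0 = n * chi0
    return 0.67449 * math.sqrt(total / (n * gamma0 ** 2))


std = NormalDist()
h, k = std.inv_cdf(0.70), std.inv_cdf(0.80)  # class indices for the table 60, 20, 10, 10
print(probable_error_r(60, 20, 10, 10, 0.3, h, k))
```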



§ (5.) To Find a Physical Meaning for the Series in r, or for the $\epsilon$ of Equation (xxi.).

Return to the original distribution

    a    b
    c    d

of p. 2. If the probabilities of the two characters or organs were quite independent,
we should expect the distribution

\[
\begin{array}{cc}
N\,\dfrac{a+b}{N}\,\dfrac{a+c}{N} & N\,\dfrac{a+b}{N}\,\dfrac{b+d}{N}\\[2ex]
N\,\dfrac{c+d}{N}\,\dfrac{a+c}{N} & N\,\dfrac{c+d}{N}\,\dfrac{b+d}{N}
\end{array}
\]

Now re-arranging our actual data we may put it thus:

\[
\begin{aligned}
a &= N\,\frac{a+b}{N}\,\frac{a+c}{N} + \frac{ad - bc}{N}, &
b &= N\,\frac{a+b}{N}\,\frac{b+d}{N} - \frac{ad - bc}{N},\\
c &= N\,\frac{c+d}{N}\,\frac{a+c}{N} - \frac{ad - bc}{N}, &
d &= N\,\frac{c+d}{N}\,\frac{b+d}{N} + \frac{ad - bc}{N}.
\end{aligned}
\]

Accordingly correlation denotes that $(ad - bc)/N$ has been transferred from each of the
second and fourth compartments, and the same amount added to each of the first and
third compartments. If $\eta = (ad - bc)/N^2$, then $\eta$ is the transfer per unit of the total
frequency. The magnitude of this transfer is clearly a measure of the divergence of
the statistics from independent variation. It is physically quite as significant as the
correlation coefficient itself, and of course much easier to determine. It must vanish
with the correlation coefficient. We see from (xxi.) that

\[
\eta = \epsilon\,HK,
\]

or we have an interpretation for the series in r of (xix.).

Now, obviously any function of $\eta$, just like $\eta$ itself, would serve as a measure of
the divergence from perfectly independent variation. It is convenient to choose a
function which shall lie arithmetically between 0 and 1.
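
The transfer is easily computed. The following is a small sketch with a hypothetical function name `transfer` and hypothetical frequencies; it also checks the re-arrangement of the data given above, namely that each observed frequency differs from its independence value by $N\eta$.

```python
def transfer(a, b, c, d):
    """eta = (ad - bc)/N^2, the transfer per unit of the total frequency."""
    n = a + b + c + d
    return (a * d - b * c) / n ** 2


a, b, c, d = 60, 20, 10, 10
n = a + b + c + d
eta = transfer(a, b, c, d)
# frequency expected in the first compartment were the characters independent
expected_a = (a + b) * (a + c) / n
print(eta, a - expected_a)
```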
Now consider what happens in the case of perfect correlation, i.e., all the
observations fall into a straight line. Hence if ad > bc, either b or c is zero, for a straight
line cannot cut all four compartments, and a and d are obviously positive. Thus c
and b can only be zero if $\eta = (c+d)(a+c)/N^2$ or $(a+b)(b+d)/N^2$. In order
that b should be zero, it is needful that h and k, as given by (iv.) and (v.), should be
positive, or a + c > b + d, a + b > c + d, and the mean fall under the 45° line
through the vertical and horizontal lines dividing the table into four compartments,
i.e., h > k. These conditions would be satisfied if ad > bc and a > d, c > b. Now
suppose our four-compartment table arranged so that

    ad > bc,  a > d,  c > b,

and consider the function

\[
Q_1 = \sin\frac{\pi}{2}\,\frac{\eta}{(a+b)(b+d)/N^2},
\tag{li.}
\]

or

\[
Q_1 = \sin\frac{\pi}{2}\,\frac{ad - bc}{(a+b)(b+d)}.
\tag{lii.}
\]

This function vanishes if $\eta = 0$, and it further equals unity if b = 0. Thus it agrees
at the limits 0 and 1 with the value of the correlation coefficient. Again, when h
and k are both zero, a = d, b = c, and $Q_1 = \sin\dfrac{\pi}{2}\,\dfrac{a-b}{a+b} = r$ by (xx.). Hence
we have found a function which vanishes with r and equals unity with r, while it is
also equal to r if the divisions of the table be taken through the medians.

Now, I take it that these are very good conditions to make for any function of
a, b, c, d which is to vanish with the "transfer," and to serve as a measure of the
degree of dependent variability, or what Mr. Yule has termed the degree of
"association." Mr. Yule has selected for his coefficient of association the expression

\[
Q_2 = \frac{ad - bc}{ad + bc}.
\tag{liii.}
\]

This vanishes with the transfer, equals unity if b or c be zero, and minus unity if a
or d be zero. The latter is, of course, unnecessary if we agree to arrange a, b, c, d
so that ad is always greater than bc. Now it is clear that $Q_2$ possesses a great
advantage over $Q_1$ in rapidity of calculation, but the coefficient of correlation is also
a coefficient which measures the association, and it is a great advantage to select one
which agrees to the closest extent with the correlation, for then it enables us to
determine other important features of the system.

If we do not make all the above conditions, we easily obtain a number of
coefficients which would vanish with the transfer. Thus for example the correlation $r_{hk}$ of
Equation (xxxviii.) is such an expression.* It has the advantage of a symmetrical
form, and has a concise physical meaning. It does not, however, become unity when
either, but not both, b and c vanish, nor does it, unless we multiply it by $\pi/2$ and
take its sine, equal the coefficient of correlation when a = d and b = c.

* In fact (xxxvii.) gives us $\epsilon = N\sigma_h\sigma_k r_{hk}$.

Again, we might deduce a fairly simple approximation to the coefficient of correlation
from the Equation (xxiv.) for $\theta$, using only its first few terms. Thus we find

^>i • ^~ CtLO ' "" (JO /I • \ 

where

\[
\chi_1 = \sqrt{\frac{\pi}{2}}\;\frac{(a+c)-(b+d)}{N}, \qquad
\chi_2 = \sqrt{\frac{\pi}{2}}\;\frac{(a+b)-(c+d)}{N},
\]

as an expression which vanishes with the transfer, and will be fairly close to the
coefficient of correlation. It is not, however, exactly unity when either b or c is
zero. But without entering into a discussion of such expressions, we can write
several down which fully satisfy the three conditions:
(i.) Vanishing with the transfer.
(ii.) Being equal to unity if b or c = 0.
(iii.) Being equal to the correlation for median divisions.
Such are, for example:

\[
Q_3 = \sin\frac{\pi}{2}\,\frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}},
\tag{lv.}
\]

\[
Q_4 = \sin\frac{\pi}{2}\,\frac{1}{1 + \dfrac{2bcN}{(ad - bc)(b + c)}}, \qquad ad > bc,
\tag{lvi.}
\]

\[
Q_5 = \sin\frac{\pi}{2}\,\frac{1}{\sqrt{1 + \kappa^2}},
\tag{lvii.}
\]

where

\[
\kappa^2 = \frac{4abcd\,N^2}{(ad - bc)^2\,(a + d)(b + c)}.
\]

Only by actual examination of the numerical results has it seemed possible to pick
out the most efficient of these coefficients. $Q_1$ was found of little service. The
following table gives the values of $Q_2$, $Q_3$, $Q_4$, and $Q_5$ in the case of fifteen series
selected to cover a fairly wide range of values:
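
The four coefficients may be computed together. The following is a hedged sketch (the function name `association_coefficients` is hypothetical, and the table must be lettered so that ad > bc, with b + c not zero); it also exhibits condition (iii.), that $Q_3$, $Q_4$ and $Q_5$ coincide with $\cos\pi\,b/(a+b)$ of (xx.) when the divisions pass through the medians.

```python
import math


def association_coefficients(a, b, c, d):
    """Return (Q2, Q3, Q4, Q5) for a fourfold table with ad > bc."""
    n = a + b + c + d
    cross = a * d - b * c
    q2 = cross / (a * d + b * c)
    root_ad, root_bc = math.sqrt(a * d), math.sqrt(b * c)
    q3 = math.sin(math.pi / 2 * (root_ad - root_bc) / (root_ad + root_bc))
    q4 = math.sin(math.pi / 2 / (1 + 2 * b * c * n / (cross * (b + c))))
    kappa_sq = 4 * a * b * c * d * n * n / (cross ** 2 * (a + d) * (b + c))
    q5 = math.sin(math.pi / 2 / math.sqrt(1 + kappa_sq))
    return q2, q3, q4, q5


# hypothetical median division: a = d, b = c
print(association_coefficients(40, 10, 10, 40), math.cos(math.pi * 10 / 50))
```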
No.         r.              h.          k.         Q2.          Q3.          Q4.     Q5.

 1    ·5939 ± ·0247     - ·0873       ·4163      ·7067        ·6054        ·6168   ·6100
 2    ·5557 ± ·0261     - ·4189     - ·4163   [illegible]     ·5657        ·5405   ·5570
 3    ·5529 ± ·0247     - ·0873     - ·0012   [illegible]     ·5809        ·5699   ·5813
 4    ·5264 ± ·0264     + ·2743     + ·3537      ·6345        ·5331        ·5200   ·5283
 5    ·5213 ± ·0294     + ·6413     + ·6966      ·6530        ·5511        ·4878   ·5160
 6    ·5524 ± ·0307     + 1·0234    + ·3537      ·7130        ·6118        ·6169   ·6138
 7    ·5422 ± ·0288     + ·6463     + ·5828      ·6693        ·5673        ·5136   ·5452
 8    ·2222 ± ·0162     + ·3190     + ·3190      ·2840     [illegible]     ·2164   ·2251
 9    ·3180 ± ·0361     + ·1381     + ·0696      ·3959        ·3185        ·3176   ·3183
10    ·5954 ± ·0272     + 1·5114    + ·7414      ·7860        ·7100        ·6099   ·6803
11    ·4708 ± ·0292     + ·0865     - ·0054      ·5692        ·4712        ·4720   ·4715
12    ·2335 ± ·0335     + ·0405     + ·0054      ·2996        ·2385        ·2385   ·2385
13    ·2451 ± ·0205     + ·2707     + ·0873      ·3103        ·2473        ·2456   ·2470
14    ·1002 ± ·0394     + ·4557     + ·1758      ·1311        ·1032        ·0993   ·1029
15    ·6928 ± ·0164     + ·5814     + ·5814      ·8032        ·7108        ·6699   ·6897


Now an examination of this table shows that notwithstanding the extreme elegance
and simplicity of Mr. Yule's coefficient of association $Q_2$, the coefficients $Q_3$,
$Q_4$, and $Q_5$, which satisfy also his requirements, are much nearer to the values
assumed by the correlation. I take this to be such great gain that it more than
counterbalances the somewhat greater labour of calculation. If we except cases (6)
and (10), in which h or k take a large value exceeding unity, we find that $Q_3$, $Q_4$, and
$Q_5$ in the fifteen cases hardly differ by as much as the probable error from the value
of the correlation. If we take the mean percentage error of the difference between
the correlation and these coefficients, we find

    Mean difference of Q2 = 24·38 per cent.
    Mean difference of Q3 =  3·95 per cent.
    Mean difference of Q4 =  2·94 per cent.
    Mean difference of Q5 =  2·72 per cent.

Thus although there is not much to choose between $Q_4$ and $Q_5$, we can take $Q_5$ as
a good measure of the degree of independent variation.

The reader may ask: Why is it needful to seek for such a measure? Why cannot
we always use the correlation as determined by the method of this paper? The
answer is twofold. We want first to save the labour of calculating r for cases where
the data are comparatively poor, and so reaching a fairly approximate result rapidly.
But labour-saving is never a wholly satisfactory excuse for adopting an inferior
method. The second and chief reason for seeking such a coefficient as Q lies in the
fact that all our reasoning in this paper is based upon the normality of the frequency.
We require to free ourselves from this assumption if possible, for the difficulty, as
is exemplified in Illustration V. below, is to find material which actually obeys
within the probable errors any such law. Now, by considering the coefficient of
regression, $r\sigma_1/\sigma_2 = S(xy)/(N\sigma_2^2)$, as the slope of the line which best fits the series
of points determined as the means of arrays of x for given values of y, we have once
and for all freed ourselves from the difficulties attendant upon assuming normal
frequency. We become indifferent to the deviations from that law, merely observing
how closely or not our means of arrays fall on a line. When we are not given arrays
but gross grouping under certain divisions, we have seen that the "transfer" is also
a physical quantity of a significance independent of normality. We want accordingly
to take a function which vanishes with the transfer, and does not diverge widely
from the correlation in cases that we can test. Here the correlation is not taken as
something peculiar to normal distributions, but something significant for all
distributions whatever. Such a function of a suitable kind appears to be given by $Q_5$.

§ (6.) On the "Excess" and its Relation to Correlation and Relative Variability.

There is another method of dealing with the correlation of characters for which
we cannot directly discover a quantitative scale which deserves consideration. It
is capable of fairly wide application, but, unlike the methods previously discussed, it
requires the data to be collected in a special manner. It has the advantage of not
applying only to the normal surface of frequency, but to any surface which can be
converted into a surface of revolution by a slide and two stretches.

It is well known that not only the normal curve but the normal surface has a
type form from which all others can be deduced by stretching or stretching and
sliding. Thus in 1895 the Cambridge Instrument Company made for the instrument
room at University College, London, a "biprojector," an instrument for giving
arbitrary stretches in two directions at right angles to any curve. In this manner
by the use of type-templates we were able to draw a variety of curves with
arbitrary parameters, e.g., all ellipses from one circle, parabolas from one parabola,
normal curves from one normal curve template. Somewhat later Mr. G. U. Yule
commenced a model of a normal frequency surface on the Brill system of
interlaced curves. This, by the variable amount of slide given to its two rectangular
systems of normal curves, illustrated the changes from zero to perfect correlation.
This model was exhibited at a College soirée in June, 1897. Geometrically this
property has been taken by Mr. W. F. Sheppard as the basis of his valuable paper
on correlation in the 'Phil. Trans.,' A, vol. 192, pp. 101-167. It is a slight addition
to, and modification of, his results that I propose to consider in this section.

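The equation to the normal frequency surface is, as we have seen in § 1,

\[
z = \frac{N}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}}\;
\text{expt.}\left\{-\frac12\,\frac{1}{1-r^2}\left(\frac{x^2}{\sigma_1^2} + \frac{y^2}{\sigma_2^2} - \frac{2rxy}{\sigma_1\sigma_2}\right)\right\}.
\]

Now write $x/(\sigma_1\sqrt{1-r^2}) = x'$, $y/\sigma_2 = y'$. This is merely giving the surface two
uniform stretches (or squeezes) parallel to the coordinate axes. We have for the
frequency of pairs lying between x, x + $\delta x$, and y, y + $\delta y$,

\[
z\,\delta x\,\delta y = \frac{N}{2\pi}\,\delta x'\,\delta y'\;
\text{expt.}\left[-\frac12\left\{\left(x' - \frac{r\,y'}{\sqrt{1-r^2}}\right)^2 + y'^2\right\}\right].
\]

Now give the area a uniform slide parallel to the axis of x defined by $r/\sqrt{1-r^2}$
at unit distance from that axis. This will not change the basal unit of area
$\delta\alpha = \delta x'\,\delta y'$, and analytically we may write

\[
X = x' - y'\,r/\sqrt{1-r^2}, \qquad Y = y', \qquad \rho^2 = X^2 + Y^2.
\]

Whence we find

\[
z\,\delta x\,\delta y = \frac{N}{2\pi}\,\delta\alpha\;\text{expt.}\left(-\tfrac12\rho^2\right).
\]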
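
The effect of the two stretches and the slide can be confirmed numerically: the quadratic form in the exponent of the original surface should equal $X^2 + Y^2$ after the change of variables. The sketch below is illustrative only; the function names and the sample values of $\sigma_1$, $\sigma_2$ and r are assumptions.

```python
import math


def stretch_and_slide(x, y, sigma1, sigma2, r):
    """Apply x' = x/(sigma1 sqrt(1-r^2)), y' = y/sigma2, then the slide r/sqrt(1-r^2)."""
    root = math.sqrt(1.0 - r * r)
    x_dash = x / (sigma1 * root)
    y_dash = y / sigma2
    big_x = x_dash - y_dash * r / root
    big_y = y_dash
    return big_x, big_y


def original_quadratic(x, y, sigma1, sigma2, r):
    """The quadratic form in the exponent of the normal surface, without the factor -1/2."""
    return (x * x / sigma1 ** 2 + y * y / sigma2 ** 2
            - 2 * r * x * y / (sigma1 * sigma2)) / (1.0 - r * r)


X, Y = stretch_and_slide(1.3, -0.7, 2.0, 0.5, 0.6)
print(X * X + Y * Y, original_quadratic(1.3, -0.7, 2.0, 0.5, 0.6))
```

The two printed numbers agree, so that contours of equal frequency become circles $\rho$ = constant about the axis of the strained surface.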



This is the mechanical changing of the Yule-Brill model analytically represented.
The surface is now one of revolution, and the proof would have been precisely the
same if we had written in the above results any function $f$, instead of the
exponential.* It is easy to see that any volume cut off by two planes through the axis of
the surface is to the whole volume as the angle between the two planes is to four right
angles. Further the corresponding volumes of this surface and the original surface
are to each other as unity to the product of the two stretches. Lastly, any plane
through the $z$-axis of the original solid remains a plane through the $z$-axis after the
two stretches and the slide. These points have all been dealt with by Mr. Sheppard
(p. 101 et seq., loc. cit.). I will here adopt his notation $r = \cos D$, and term with him
$D$ the divergence. Thus $\cot D$ is (in the language of the theory of strain) the slide,
and $D$ is the angle between the strained positions of the original $x$ and $y$ directions.
Now consider any plane which makes an angle $\chi$ with the plane of $xz$ before strain.
Then, since the contour lines of the correlation surface are ellipses, the volumes of
the surface upon the like shaded opposite angles of the plan diagram below will be
equal; and if they be $n_1$ and $n_2$, then $n_1 + n_2 = \tfrac{1}{2}N$. If $n_1'$ and $n_2'$ be the volumes
after strain, then by what precedes we shall have



$$n_1 = \sigma_1\sigma_2\sqrt{1-r^2}\times n_1', \qquad n_2 = \sigma_1\sigma_2\sqrt{1-r^2}\times n_2',$$

and

$$(n_2 - n_1)/(n_1 + n_2) = (n_2' - n_1')/(n_1' + n_2').$$



[Plan diagram: the line AB drawn through the mean at the angle χ to the axis of x, with the axes x, −x, y, −y and the like shaded opposite angles n₁ and n₂.]

* The generalisation is not so great as might at first appear, for I have convinced myself that this
property of conversion into a surface of revolution by stretches and slides does not hold for actual cases
of markedly skew correlation.

Now $n_1'$ and $n_2'$ will be as the angles between the strained positions of the planes
bounding $n_1$ and $n_2$. $Ox$ does not change its direction. $Oy$ is turned through an
angle $\pi/2 - D$ clockwise, and $\chi$ becomes $\chi''$, say. Hence

$$n_1' : n_2' :: \frac{\pi}{2} - \chi'' + \frac{\pi}{2} - D : \frac{\pi}{2} + \chi'' - \frac{\pi}{2} + D,$$

or

$$(n_2' - n_1')/(n_2' + n_1') = \frac{2}{\pi}(\chi'' + D) - 1.$$

Let us write $E_1 = 2(n_2 - n_1)$ and term it the excess for the $y$-character for the
line AB. Then we easily find:

$$\tan\left(\frac{E_1}{N}\frac{\pi}{2} + \frac{\pi}{2}\right) = \tan(\chi'' + D) = \frac{\cot\chi'' + \cot D}{\cot\chi''\cot D - 1} \qquad \text{(lviii.)}.$$

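It remains to determine $\tan\chi''$ and substitute. The stretches alter $\tan\chi$ into
$\tan\chi'$, such that

$$\tan\chi' = \frac{\sigma_1\sqrt{1-r^2}}{\sigma_2}\tan\chi.$$

Further, by the slide

$$\cot\chi'' = \cot\chi' - \cot D = \frac{\sigma_2}{\sigma_1\sqrt{1-r^2}}\cot\chi - \cot D.$$

Hence we have by (lviii.) above

$$-\cot\left(\frac{E_1}{N}\frac{\pi}{2}\right) = \frac{\dfrac{\sigma_2}{\sigma_1\sin D}\cot\chi}{\dfrac{\sigma_2}{\sigma_1\sin D}\cot\chi\cot D - \cot^2 D - 1},$$

or,

$$-\tan\left(\frac{E_1}{N}\frac{\pi}{2}\right) = \cot D - \frac{\sigma_1}{\sigma_2}\frac{\tan\chi}{\sin D} \qquad \text{(lix.)}.$$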

Now the excess $E_1$ is the difference of the frequencies in the sum of the strips of
the volume made by planes parallel to the plane $yz$ on the two sides of the plane AB$z$
(defined by $\chi$), taken without regard to sign. For on one side of the mean yy this is
$n_2 - n_1$, and on the other $-(n_1 - n_2)$. Hence we have this definition of $E_1$, the
column excess for any line through the mean of a correlation table: Add up the
frequencies above and below the line in each column and take their differences without
regard to sign, and their sum is the column excess.

If we are dealing with an actual correlation table and not with a method of 
collecting statistics, then care must be taken to properly proportion the frequencies 
in the column in which the mean occurs, and also in the groups which are crossed by 
the line. It is the difficulty of doing this satisfactorily, especially if the grouping, as
in eye and coat colour, is large and somewhat rough, that hinders the effective use of 
the method, if the statistics have not been collected ad hoc. 

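Now let $E_2$ be the row excess for the line AB, defined in like manner, then we have
in the same way

$$-\tan\left(\frac{E_2}{N}\frac{\pi}{2}\right) = \cot D - \frac{\sigma_2}{\sigma_1}\frac{\cot\chi}{\sin D} \qquad \text{(lix.}^{bis}\text{)}.$$

Now eliminate $\sigma_1/\sigma_2$ between (lix.) and (lix.$^{bis}$); then

$$\left\{\tan\left(\frac{E_1}{N}\frac{\pi}{2}\right) + \cot D\right\}\left\{\tan\left(\frac{E_2}{N}\frac{\pi}{2}\right) + \cot D\right\} = \frac{1}{\sin^2 D}.$$

Whence we deduce

$$\cot D = \cot\left(\frac{E_1 + E_2}{N}\frac{\pi}{2}\right),$$

and, therefore,

$$r = \cos D = \cos\left(\frac{E_1 + E_2}{N}\frac{\pi}{2}\right) \qquad \text{(lx.)}.$$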



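Substituting for $D$ in (lix.) we find further

$$\frac{\sigma_1}{\sigma_2} = \cot\chi\,\frac{\cos\left(\dfrac{E_2}{N}\dfrac{\pi}{2}\right)}{\cos\left(\dfrac{E_1}{N}\dfrac{\pi}{2}\right)} \qquad \text{(lxi.)}.$$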



Thus Equations (lx.) and (lxi.) give the coefficient of correlation and the relative
variability of the two characters. The latter is, I believe, quite new, the former novel
in form.
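
The two results can be tried numerically. The following is a minimal Python sketch added for illustration, not part of the original memoir: the function names are hypothetical, angles are in radians, and (lxi.) is assumed to read $\sigma_1/\sigma_2 = \cot\chi\,\cos(E_2\pi/2N)/\cos(E_1\pi/2N)$.

```python
import math


def correlation_from_excesses(e1, e2, n):
    """r = cos((E1 + E2)/N * pi/2), the excess formula (lx.)."""
    return math.cos((e1 + e2) / n * math.pi / 2)


def sigma_ratio_from_excesses(e1, e2, n, chi):
    """sigma_1/sigma_2 = cot(chi) * cos(E2*pi/2N) / cos(E1*pi/2N), assumed form of (lxi.)."""
    cos_e1 = math.cos(e1 / n * math.pi / 2)
    cos_e2 = math.cos(e2 / n * math.pi / 2)
    return cos_e2 / (cos_e1 * math.tan(chi))


def sheppard_correlation(p, r):
    """Quadrant form r = cos(pi * R / (R + P)) with P and R the quadrant frequencies."""
    return math.cos(math.pi * r / (r + p))
```

With a line along the axis of x the excesses become $E_1 = 2(R - P)$ and $E_2 = N$, and the first function then returns the same value as the quadrant form; with equal excesses and a 45° line the second function returns unity.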

If we call $m_1$ the frequency in the angle $\chi$ (BOx of the figure above), then it is easy
to see that $E_1 = 2(n_2 - n_1) = N - 4n_1$, and similarly $E_2 = N - 4m_1$. Thus
$(E_1 + E_2)/N = 2(N - 2(n_1 + m_1))/N$. But $n_1 + m_1$ is the frequency in the first
quadrant. This Mr. Sheppard terms P, while that in the second he terms R. We
have thus $(E_1 + E_2)/N = 2R/(R + P)$, or

$$r = \cos\left(\frac{R}{R + P}\,\pi\right) \qquad \text{(lxii.)},$$

i.e., Mr. Sheppard's fundamental result* ('Phil. Trans.,' A, vol. 192, p. 141).

We can, of course, get Mr. Sheppard's result directly if we put $\chi = 0$, when we
have at once $E_1 = 2(R - P)$, $E_2 = N = 2(R + P)$, and the result follows.

Equation (lxi.) may also be written in the form

$$\frac{\sigma_1}{\sigma_2} = \cot\chi\,\frac{\sin\left(\dfrac{m_1}{N}2\pi\right)}{\sin\left(\dfrac{n_1}{N}2\pi\right)} \qquad \text{(lxiii.)}.$$

If we put $\chi = 0$, then $m_1$ becomes zero, and the right-hand side of (lxiii.) is
indeterminate. If we proceed, however, to the limit by evaluating the frequency in
an indefinitely thin wedge of angle $\chi$, we reach merely the identity $\sigma_1/\sigma_2 = \sigma_1/\sigma_2$.
Hence there is no result corresponding to (lxi.) to be obtained by taking
Mr. Sheppard's case of $\chi = 0$.

* In the actual classification of data (lx.) and (lxii.) suggest quite different processes. We can apply (lx.)
where (lxii.) is difficult or impossible, e.g., correlation in shading of birds' eggs from the same clutch.

The following are the values of the probable errors of the quantities involved:

$$\text{Probable error of } E_1 = .67449\sqrt{N(1 - E_1^2/N^2)} \qquad \text{(lxiv.)}.$$

$$\text{Probable error of } E_2 = .67449\sqrt{N(1 - E_2^2/N^2)} \qquad \text{(lxv.)}.$$



Correlation between errors in E, and E^ = ■— A / S't-~--t^tiJAi~-~-^t>Js • • (Ixvi.). 



^ , TT . -67449 sin Dv^D (tt — D) ., .. . 

Jrrobable errorin r = ^^ — _--_— ^ ^ _ ^ (Ixvn.), 



where D = -^~^sF~^ o (^/ Sheppakd, loc. cit,, p. 148). 
Probable error in ratio crja-c^ = 



'E. 



yiT a, 2 U^ N^ j*^^ U 2 J + 1,^ nJ*^'' VN 2, 



TT 



•^ 



+ ^(1-nJ(1-nJ*^H^'2J^-Hn2/J 



1. 
2 



. (Ixviii.). 
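
For the two simplest of these, (lxiv.) and (lxv.), a short Python sketch follows. It is an added illustration only: the function name is hypothetical, and the root is assumed to cover the whole of $N(1 - E^2/N^2)$, which is the reading that agrees with the standard deviation of an excess $E = N - 2c$ when $c$ is a binomial count out of $N$.

```python
import math

PROBABLE_ERROR_FACTOR = 0.67449  # ratio of probable error to standard deviation


def probable_error_of_excess(e, n):
    """Probable error of an excess E in a total of N, assumed form of (lxiv.) and (lxv.)."""
    return PROBABLE_ERROR_FACTOR * math.sqrt(n * (1.0 - (e / n) ** 2))
```

The probable error is greatest when the excess is zero and vanishes when the excess equals the whole frequency.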



The application of the method here discussed to statistics without quantitative
scale can now be indicated. If the characters we are dealing with have the same
scale, although it be unknown, then, if the quantitative order be maintained, i.e.,
individuals arranged in order of lightness or darkness of coat or eye-colour, the
diagonal line on the table at 45° will remain unchanged, however we may suppose
parts of the scale to be distorted, for the distortion will be the same at corresponding
points of both axes. Further, if we suppose the mean of the two characters to be the
same, this 45° line will pass through that mean, and will serve for the line AB of the
above investigation. In this case we must take $\tan\chi = 1$, and consequently (lxi.)
becomes

$$\sigma_1/\sigma_2 = \cos\left(\frac{E_2}{N}\frac{\pi}{2}\right)\Big/\cos\left(\frac{E_1}{N}\frac{\pi}{2}\right) \qquad \text{(lxix.)}.$$

We can even, when the mean is a considerable way off the 45° line, get, in some
cases, good results. Thus, the correlation in stature of husband and wife worked out
by the ordinary product moment process is .2872. But in this case $E_1 = 382.062$,
$E_2 = 806.425$, and this gives the correlation .2994. On the other hand, the actual
ratio of variabilities is 1.12, while (lxix.) makes it 2.76! This arises from the fact
that the errors in $E_1$ and $E_2$, due to the mean being off the 45° line, tend to cancel in
$E_1 + E_2$, but tend in directly opposite directions in the ratio of the cosines. Similarly
the correlation between father and son works out .5666, which may be compared with
the values given in Illustration V. below, ranging from .5198 to .5939. Again,
correlation in eye-colour between husband and wife came out by the excess process
.0986, and by the process given earlier in the present Memoir .1002. But all these are
favourable examples, and many others gave much worse results. We ought really only
to apply it to find $\sigma_1/\sigma_2$ when the means are on the 45° line, as in the correlation of the
same character in brethren, and even in this case the statistics ought to be collected
ad hoc, i.e., we ought to make a very full quantitative order, and then notice for each
individual case the number above and below the type. For example, suppose we had
a diagram of some twenty-five to thirty eye tints in order (e.g., like Bertrand's),
then we take any individual, note his tint, and observe how many relatives of a
particular class (brethren or cousins, say) have lighter and how many darker
eyes; the difference of the two would be the excess for this individual. The same
plan would be possible with horses' coat-colour and other characters. After trying the
plan of the excesses on the data at my disposal for horses' coat-colour and human
eye-colour (which were not collected ad hoc), I abandoned it for the earlier method of
this Memoir; for, the classification being in large groups, the proportioning of the
excess (as well as the differences in the means) introduced too great errors for such
investigations.

§ 7. On a Generalisation of the Fundamental Theorem of the Present Memoir. 



If we measure deviations in units of standard deviations, we may take for the
equation to the correlation surface for $n$ variables

$$z = \frac{N}{(2\pi)^{n/2}\sqrt{R}}\,e^{-\frac{1}{2}\left\{S_1\left(\frac{R_{ss}}{R}x_s^2\right) + 2S_2\left(\frac{R_{ss'}}{R}x_s x_{s'}\right)\right\}} \qquad \text{(lxx.)},$$

where

$$R = \begin{vmatrix} 1 & r_{12} & r_{13} & \cdots & r_{1n} \\ r_{21} & 1 & r_{23} & \cdots & r_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ r_{n-1,1} & r_{n-1,2} & r_{n-1,3} & \cdots & r_{n-1,n} \\ r_{n,1} & r_{n,2} & r_{n,3} & \cdots & 1 \end{vmatrix},$$

and $R_{pq}$ is the minor obtained by striking out the $p$th row and $q$th column. $r_{pq}$ is, of
course, the correlation between the $p$th and $q$th variables, and equals $r_{qp}$. $S_1$ denotes
a summation for $s$ from 1 to $n$, and $S_2$ a summation of every possible pair out of the
$n$ quantities 1 to $n$.

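Now take the logarithmic differential of $z$ with regard to $r_{pq}$. We find

$$\frac{1}{z}\frac{dz}{dr_{pq}} = -\frac{1}{2R}\frac{dR}{dr_{pq}} - \frac{1}{2}S_1\left\{\frac{d}{dr_{pq}}\left(\frac{R_{ss}}{R}\right)x_s^2\right\} - S_2\left\{\frac{d}{dr_{pq}}\left(\frac{R_{ss'}}{R}\right)x_s x_{s'}\right\}$$

$$= -\frac{R_{pq}}{R} + S_1\left(\frac{R_{ps}R_{qs}}{R^2}x_s^2\right) + S_2\left(\frac{R_{ps}R_{qs'} + R_{ps'}R_{qs}}{R^2}x_s x_{s'}\right).$$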



For

$$dR/dr_{pq} = 2R_{pq},$$

and, generally, whether $s$ is or is not $= s'$, or these are or are not $= p$ and $q$, we
have

$$\frac{d}{dr_{pq}}\left(\frac{R_{ss'}}{R}\right) = -\frac{R_{ps}R_{qs'} + R_{ps'}R_{qs}}{R^2}.$$

This follows thus:

$$\frac{d}{dr_{pq}}\left(\frac{R_{ss'}}{R}\right) = \frac{1}{R}\frac{dR_{ss'}}{dr_{pq}} - \frac{R_{ss'}}{R^2}\frac{dR}{dr_{pq}},$$

or we have to show that

$$\frac{dR_{ss'}}{dr_{pq}} = \frac{2R_{pq}R_{ss'} - R_{ps}R_{qs'} - R_{ps'}R_{qs}}{R} = \frac{R_{ss'}R_{pq} - R_{ps}R_{qs'}}{R} + \frac{R_{ss'}R_{pq} - R_{ps'}R_{qs}}{R} = {}_{pq}R_{ss'} + {}_{qp}R_{ss'},$$

where ${}_{pq}R_{ss'}$ is the minor corresponding to the term $r_{pq}$ in $R_{ss'}$, and ${}_{qp}R_{ss'}$ the minor
corresponding to the term $r_{qp}$.* But this last result is obvious because $R_{ss'}$ only contains
$r_{pq}$ in two places, i.e., as $r_{pq}$ and $r_{qp}$.

Putting $s = s'$ we have the other identity required above, i.e.,

$$\frac{d}{dr_{pq}}\left(\frac{R_{ss}}{R}\right) = -\frac{2R_{ps}R_{qs}}{R^2}.$$



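Returning now to the value for $\frac{1}{z}\frac{dz}{dr_{pq}}$ on the previous page, we see that the two
sum terms may be expressed as a product, or we may put

$$\frac{1}{z}\frac{dz}{dr_{pq}} = -\frac{R_{pq}}{R} + S_1\left(\frac{R_{ps}}{R}x_s\right)\times S_1\left(\frac{R_{qs}}{R}x_s\right).$$

Now write

$$z = \frac{N}{(2\pi)^{n/2}\sqrt{R}}\,e^{-\phi}.$$

Then

$$\frac{d\phi}{dx_p} = S_1\left(\frac{R_{ps}}{R}x_s\right), \qquad \frac{d^2\phi}{dx_p\,dx_q} = \frac{R_{pq}}{R}.$$

Hence

$$\frac{1}{z}\frac{dz}{dr_{pq}} = -\frac{d^2\phi}{dx_p\,dx_q} + \frac{d\phi}{dx_p}\frac{d\phi}{dx_q}.$$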



Now differentiate $\log z$ with regard to $x_p$. Then

$$\frac{1}{z}\frac{dz}{dx_p} = -\frac{d\phi}{dx_p}, \qquad \frac{1}{z}\frac{d^2z}{dx_p\,dx_q} - \frac{1}{z^2}\frac{dz}{dx_p}\frac{dz}{dx_q} = -\frac{d^2\phi}{dx_p\,dx_q}.$$

Thus finally

$$\frac{1}{z}\frac{d^2z}{dx_p\,dx_q} = \frac{d\phi}{dx_p}\frac{d\phi}{dx_q} - \frac{d^2\phi}{dx_p\,dx_q} = \frac{1}{z}\frac{dz}{dr_{pq}} \qquad \text{(lxxii.)}.$$

* See also Scott, 'Theory of Determinants,' p. 59.


In other words, the operator $d/dr_{pq}$ acting on $z$ can always be replaced by the
operator $d^2/dx_p\,dx_q$.
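
For two variables this equivalence of operators can be checked numerically. The sketch below is an added illustration, not the memoir's own procedure: it assumes the surface in standard units with $N = 1$, uses central differences with an arbitrarily chosen step, and all names are hypothetical.

```python
import math


def z(x, y, r):
    """Normal correlation surface for two variables in standard units (N = 1)."""
    q = (x * x - 2.0 * r * x * y + y * y) / (1.0 - r * r)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - r * r))


def dz_dr(x, y, r, eps=1e-4):
    """Central-difference estimate of dz/dr."""
    return (z(x, y, r + eps) - z(x, y, r - eps)) / (2.0 * eps)


def d2z_dxdy(x, y, r, eps=1e-4):
    """Central-difference estimate of the mixed derivative d^2 z / dx dy."""
    return (z(x + eps, y + eps, r) - z(x + eps, y - eps, r)
            - z(x - eps, y + eps, r) + z(x - eps, y - eps, r)) / (4.0 * eps * eps)
```

At any point the two estimates should agree to within the error of the finite differences.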

Let $d/d\rho_{pq}$ denote the effect of applying the operator $d/dr_{pq}$ to $z$, and putting $r_{pq}$
zero after all differentiations have been performed, then the effect of this operator will
be the same as if we used $d^2/dx_p\,dx_q$ on $z$, putting $r_{pq}$ zero before differentiation.
Generally, let F be any series of operations like $d/d\rho_{pq}$, then we see that

$$F\left(\frac{d}{d\rho_{pq}}, \frac{d}{d\rho_{p'q'}}, \frac{d}{d\rho_{p''q''}}, \ldots\right)z = F\left(\frac{d^2}{dx_p\,dx_q}, \frac{d^2}{dx_{p'}\,dx_{q'}}, \frac{d^2}{dx_{p''}\,dx_{q''}}, \ldots\right)\frac{N}{(2\pi)^{n/2}}\,e^{-\frac{1}{2}S_1(x_s^2)}.$$

Now let F be the function which gives the operation of expanding $z$ by Maclaurin's
theorem in powers of the correlation coefficients, i.e.,

$$F = e^{S_2\left(r_{ss'}\frac{d}{d\rho_{ss'}}\right)},$$

then

$$z = e^{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)}\,\frac{N}{(2\pi)^{n/2}}\,e^{-\frac{1}{2}S_1(x_s^2)}.$$

This is the generalised form of result (xiv.) reached above.

Now let

$$z_0 = \frac{N}{(2\pi)^{n/2}}\,e^{-\frac{1}{2}S_1(x_s^2)},$$

then $z_0$ is the ordinate of a frequency surface of the $n$th order, in which the distribution
of the $n$ variables is absolutely independent. We have accordingly the extremely
interesting geometrical interpretation that the operator

$$e^{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)}$$

applied to a surface of frequency for $n$ independent variables converts it into a surface
of frequency for $n$ dependent variables, the correlation between the $s$th and $s'$th
variables being $r_{ss'}$.*

* I should like to suggest to the pure mathematician the interest which a study of such operators would
have, and in particular of the generalised form of projection in hyperspace indicated by them.

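Expanding, we have

$$z = \left\{1 + S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right) + \frac{1}{2!}\left\{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)\right\}^2 + \ldots + \frac{1}{m!}\left\{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)\right\}^m + \ldots\right\}z_0 \qquad \text{(lxxiii.)}.$$

Our next stage is to evaluate the operation

$$\left\{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)\right\}^m z_0.$$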



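Let us put

$${}_sv_1 = x_s, \qquad {}_sv_2 = x_s^2 - 1, \qquad {}_sv_3 = x_s(x_s^2 - 3),$$

and ${}_sv_p$ = the $p$th function of $x_s$ as defined by (xv.).

Let $\epsilon_s$ be a symbol such that $\epsilon_s^p$ represents ${}_sv_p$. Then we shall show that

$$\left\{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)\right\}^m z_0 = z_0\left\{S_2\left(r_{ss'}\epsilon_s\epsilon_{s'}\right)\right\}^m.$$
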

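We shall prove this by induction.
By (xii.)

$${}_sv_{m+1} = x_s\,{}_sv_m - m\,{}_sv_{m-1},$$

or

$$\epsilon_s^{m+1} = x_s\,\epsilon_s^m - m\,\epsilon_s^{m-1},$$

and by (xvi.)

$$\frac{d\,{}_sv_m}{dx_s} = m\,{}_sv_{m-1},$$

or

$$\frac{d\,\epsilon_s^m}{dx_s} = m\,\epsilon_s^{m-1}.$$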
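
The two properties just quoted can be illustrated by a small Python sketch, added here as a check and not taken from the memoir. The function names are hypothetical, $v_0 = 1$ is assumed as the starting value of the recurrence, and the derivative is estimated by a central difference.

```python
def v(p, x):
    """p-th function v_p(x), built from v_0 = 1, v_1 = x and v_{m+1} = x*v_m - m*v_{m-1}."""
    previous, current = 1.0, x
    if p == 0:
        return previous
    for m in range(1, p):
        previous, current = current, x * current - m * previous
    return current


def dv_dx(p, x, eps=1e-5):
    """Central-difference derivative of v_p, for comparison with p * v_{p-1}."""
    return (v(p, x + eps) - v(p, x - eps)) / (2.0 * eps)
```

The recurrence reproduces $x^2 - 1$ and $x(x^2 - 3)$ for the second and third functions, and the numerical derivative of the fifth function agrees with five times the fourth.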



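Now, let $\chi(\epsilon_s)$ be any function of $\epsilon_s$

$$= S(A_p\,\epsilon_s^p),$$

if we suppose it can be expanded in powers of $\epsilon_s$.
Then

$$\frac{d}{dx_s}\chi(\epsilon_s) = \frac{d}{dx_s}S(A_p\,\epsilon_s^p) = (x_s - \epsilon_s)\,S(A_p\,\epsilon_s^p) = (x_s - \epsilon_s)\,\chi(\epsilon_s).$$

Similarly

$$\frac{d^2}{dx_s\,dx_{s'}}\chi(\epsilon_s, \epsilon_{s'}) = (x_s - \epsilon_s)(x_{s'} - \epsilon_{s'})\,\chi(\epsilon_s, \epsilon_{s'}) \qquad \text{(lxxiv.)}.$$

Now suppose that

$$\left\{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)\right\}^m z_0 = z_0\left\{S_2\left(r_{ss'}\epsilon_s\epsilon_{s'}\right)\right\}^m \qquad \text{(lxxv.)},$$

then

$$\left\{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)\right\}^{m+1} z_0 = S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)(z_0 U),$$

where $U$ stands for $\{S_2(r_{ss'}\epsilon_s\epsilon_{s'})\}^m$.
Hence, remembering that $dz_0/dx_s = -x_s z_0$,

$$\left\{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)\right\}^{m+1} z_0 = z_0\,S_2(r_{ss'}x_s x_{s'})\,U + z_0\,S_2\left(r_{ss'}\frac{d^2U}{dx_s\,dx_{s'}}\right) - z_0\,S_2\left\{r_{ss'}\left(x_s\frac{dU}{dx_{s'}} + x_{s'}\frac{dU}{dx_s}\right)\right\}$$

$$= z_0\left\{S_2\left(r_{ss'}\epsilon_s\epsilon_{s'}\right)\right\}^{m+1},$$

which had to be proved.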

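But it is easy to show by simple differentiation that

$$S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)z_0 = z_0\,S_2\left(r_{ss'}\,{}_sv_1\,{}_{s'}v_1\right) = z_0\,S_2\left(r_{ss'}\epsilon_s\epsilon_{s'}\right),$$

$$\left\{S_2\left(r_{ss'}\frac{d^2}{dx_s\,dx_{s'}}\right)\right\}^2 z_0 = z_0\left\{S_2\left(r_{ss'}\epsilon_s\epsilon_{s'}\right)\right\}^2 \qquad \text{(lxxvii.)}.$$

Hence the theorem is generally true.
Thus we conclude that

$$z = z_0\left\{1 + S_2(r_{ss'}\epsilon_s\epsilon_{s'}) + \frac{1}{2!}\left\{S_2(r_{ss'}\epsilon_s\epsilon_{s'})\right\}^2 + \ldots + \frac{1}{m!}\left\{S_2(r_{ss'}\epsilon_s\epsilon_{s'})\right\}^m + \ldots\right\} \qquad \text{(lxxviii.)}.$$

It is quite straightforward, if laborious, to write down the expansion for any number
of variables.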
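
For two variables the expansion reduces to a single series in $r$, which can be compared with the surface itself. The Python sketch below is an added illustration under that assumption ($n = 2$, so that the $m$th term is $r^m\,v_m(x)\,v_m(y)/m!$); the names and the number of terms retained are hypothetical choices.

```python
import math


def v(p, x):
    """p-th function v_p(x) from v_0 = 1, v_1 = x and v_{m+1} = x*v_m - m*v_{m-1}."""
    previous, current = 1.0, x
    if p == 0:
        return previous
    for m in range(1, p):
        previous, current = current, x * current - m * previous
    return current


def series_ratio(x, y, r, terms=40):
    """z/z_0 from the expansion with n = 2: sum over m of r^m v_m(x) v_m(y) / m!."""
    return sum(r ** m * v(m, x) * v(m, y) / math.factorial(m) for m in range(terms))


def exact_ratio(x, y, r):
    """z/z_0 computed directly from the correlated and the independent surfaces."""
    q = (x * x - 2.0 * r * x * y + y * y) / (1.0 - r * r)
    return math.exp(-0.5 * q + 0.5 * (x * x + y * y)) / math.sqrt(1.0 - r * r)
```

For moderate values of $r$ the partial sums settle quickly on the directly computed ratio.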

Now let Q be the total frequency of complexes of variables with $x_1$ lying between
$h_1$ and $\infty$, $x_2$ between $h_2$ and $\infty$, ... $x_s$ between $h_s$ and $\infty$, ... $x_n$ between $h_n$ and $\infty$;
and let $Q_0$ be the frequency of such complexes if there were no correlations.
Then

fCO /'CO »00 /•OO 

• • . I » . . <v (JoX-i CLXo , . . OjXg , » , (X/X^i 

-co J.00 /.CO y.00 

tjO I . . . I a . . I -VQ VuX-i CLXcy . . . COXg . . . d/Xji 

1 f^ i 



Now let 



A/27r J 7^, 
E 2 
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11, 



v/27r 



e 



2'''' 






We have 

But by (xviii.) 



«5o *~~~' r^lr^Q * ' r^<? * * P^t -^^^l -^^^^S " ' -O.^ » • XX;^. 



A (XI 



where 



and as above. 



\/2 






/is 



.V;; e' 



4%3 



Ps 



<S /i 3 



y^i^-1 — b%J-<-lJ;% = /i. 



s i « 



7i. 



e '^""^dx, 






Z S G 



J i X.A.yvx3t» 111 

a* \ 
-CT" -1L7' '-rir' 1 » 

-<\. j<^ -o>, X # l« 



Thus 



/.OO /-co 



(v/27r)^^ 



/i, 



/.CO pOO /.CO 

I .9.1 «.»l s^p s' ^p' s"^p'^ . » « G " Ci'a3|CC'(A^rt « « 8 (AjOC^ e » « UjQjji 



.'y., 



fVn/^l xii'^r, 



|T J-J T-J H iS iff iff 5 '^''^^"1 ^'■''^^'--1 "''''i'"-^ 

ft ft' ft'" 



or 



fOO rtOO «oo «00 

I •«•! «»»l 'VnlXl^t/^ I Cvt/yiCvt^o » • * 



L-vJOi 



e Vv'tA^ 






/^ 



QoH ( l|ri 



e 5 



e . I iX-?C2k.ll, ft 



where $\Pi$ denotes a product of ${}_sv_p$ for any number of $v$'s with any $s$ and $p$. The rule,
therefore, is very simple. We must expand the value of $z$ in $v$'s as given by (lxxviii.)
above, then the multiple integral of this will be obtained by lowering every $v$'s right-hand
subscript by unity (remembering that $v_0 = 1$), and further dividing by the $\beta$ of
the left-hand subscript. The general expression up to terms of the fourth order has
been written down; it involves thirty-four sums, each represented by a type term.
All these would only occur in the case of the correlation of eight organs, or when we
have to deal with twenty-eight coefficients of correlation. Such a number seems
beyond our present power of arithmetical manipulation, so that I have not printed the
general expressions. At the same time, the theory of multiple correlation is of such
great importance for problems of evolution, in which over and over again we have
three or four correlated characters to deal with,* that it seems desirable to place
on record the expansion for these cases. I give four variables up to the fourth and
three variables up to the fifth order terms. Afterwards I will consider special cases.



* In my memoir on Prehistoric Stature I have dealt with five correlated organs, i.e., ten coefficients. In
some barometric investigations now in hand we propose to deal with at least fifteen coefficients, while
Mr. Bramley-Moore, in the correlation of parts of the skeleton, has, in a memoir not yet published, dealt
with between forty and fifty cases of four variables or six coefficients.

Value of the Quadruple Integral in the Case of Four Variables.*

* To simplify the notation, $v'$, $v''$, $v'''$, $v^{iv}$ have been used for ${}_1v$, ${}_2v$, ${}_3v$, ${}_4v$.
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^ AA/8s/34 ^ ^ ' 1 ^ ^1/33/3,, 1 ^ 1 ^ /3i/33/3, 1 1 3 -r ^^^^^^ i 3 1 

+ ^,/33^, ^^ ^^ ^^ + A/3,A/3. ' ' ^ /3iA/33/3. "^^"^^ + m^A '''''' 

+ /3,/3,A /'^'^'"i + A/3.^3 '^ ' ^ A/32^3 '' ' 

+ '^;w: '^ ^ + ^.A/33^, "^^'^ "^ -^^ /3^/3,/33/3, "^^^ "1 

+ AA/3b/3,. ^^ '■ ^ m,m, ' ' ^ mSA, ' ' ' 



" IV 
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^ A/SpAA ® ' 1 ^ A AAA -^^ ^ ^ AAA A ^ ^ ' 

, 12?'i3^ri4'r34 , ,„ ■ . 12ri3rj^-'r34 , „, ; 129j3ri£3j , ,„ j 

+ "AAsr ^^^^^ ^^' + AAA ''''^' ^ AAA ^'^"'^ "^^ 
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+ AAAA ' ' ^ AAAA ' '' ^ AAAA ^' 

+ ^AAM' ^' '' ' "^ AAAA ''' '■' '' + AAAA '' '' 

+ AAAA ^ ' ' ^ AAAA ^ ' ^ AAAA ' ^ ' 

12^-13^^24^34 ; ,,, . • 12^-13^24 ■% ,, ,„ . 12r^£3£3^^ ,„ .^ 

+ AAAA ^^^^ "^ + AAAA7^^ '' "'' + AAAA "' .^^ 
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HlH^P'i>H4. P1P3PSP4 P1P2P8P4 
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P1P3P3P4 PiPgPgP/fc P1P2P3P4 

/I •«• \ 

In the case of three variables, we must cancel in the above expression all terms
involving the fourth variable. Thus we shall have 3 instead of 6 first order terms, 6 instead of 21
second order terms, 10 instead of 56 third order terms, and 15 instead of 126 fourth
order terms, a much more manageable series.
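
The counts quoted here are the numbers of distinct products of a given order that can be formed from the correlation coefficients. A short Python sketch, added for illustration with hypothetical function names, reproduces them by counting products of the given degree in $n(n-1)/2$ coefficients.

```python
from math import comb


def coefficient_count(variables):
    """Number of correlation coefficients among the given number of variables."""
    return comb(variables, 2)


def terms_of_order(variables, order):
    """Number of distinct products of the given order in the correlation coefficients."""
    k = coefficient_count(variables)
    return comb(order + k - 1, k - 1)
```

The same count gives twenty-eight coefficients for eight organs.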

I give below the extra term necessary for calculating the value of $(Q - Q_0)/Q_0$ as
far as the fifth order terms in the case of three variables.



Fifth Order Terms for 'Three Variables. 
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•3 A., r.. 3 ., 3 'mr q. 3 ,^,. 3 ' 1 
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A numerical illustration of these formulae will be given in the latter part of this 



\ 



Memoir. It will, however, be clear that what we want are tables of log ( '^ !, including 
log r^\ or log f^j for a series of values of h. Such tables would render the compu- 

tation of -—rr — ^ fairly direct and rapid ; they could be fairly easily calculated from 



existing tables for the ordinate and area of the normal curve, and I hope later to 
find some one willing to undertake them. 

Meanwhile let us look at special cases. In the first place, suppose, in the case of
three variables, that the division of the groups is taken at the mean, i.e., $h_1 = h_2 = h_3 = 0$.
Then we have

$$\beta_1 = \beta_2 = \beta_3 = \int_0^\infty e^{-\frac{1}{2}x^2}\,dx = \sqrt{\frac{\pi}{2}},$$

$$v_1' = v_1'' = v_1''' = 0, \qquad v_2' = v_2'' = v_2''' = -1,$$

$$v_3' = v_3'' = v_3''' = 0, \qquad v_4' = v_4'' = v_4''' = 3.$$


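Hence we have

$$\int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty z\,dx_1\,dx_2\,dx_3 = Q_0\left\{1 + \frac{2}{\pi}(r_{12} + r_{13} + r_{23}) + \frac{2}{\pi}\frac{1}{3!}(r_{12}^3 + r_{13}^3 + r_{23}^3) + \frac{2}{\pi}\frac{9}{5!}(r_{12}^5 + r_{13}^5 + r_{23}^5) + \ldots\right\}$$

$$= Q_0\left\{1 + \frac{2}{\pi}\left(\sin^{-1}r_{12} + \sin^{-1}r_{13} + \sin^{-1}r_{23}\right)\right\} \qquad \text{(lxxxv.)}.$$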
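
The closed form can be compared with the leading terms of the series in a few lines of Python. This is an added sketch, not the memoir's computation; the function names are hypothetical, and the series is carried only to the fifth order terms shown above.

```python
import math


def octant_ratio(r12, r13, r23):
    """Q/Q_0 for three variables divided at their means, from the arcsine form."""
    return 1.0 + (2.0 / math.pi) * (math.asin(r12) + math.asin(r13) + math.asin(r23))


def octant_ratio_series(r12, r13, r23):
    """The same ratio from the series, kept to the fifth order terms."""
    def partial(r):
        return r + r ** 3 / 6.0 + 9.0 * r ** 5 / 120.0
    return 1.0 + (2.0 / math.pi) * (partial(r12) + partial(r13) + partial(r23))
```

With all three correlations equal to one-half the ratio is 2, so that one-quarter of the whole frequency, instead of one-eighth, falls in the octant.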



Let $r_{12} = \cos D_{12}$, $r_{13} = \cos D_{13}$, $r_{23} = \cos D_{23}$, and let E be the spherical excess of
the spherical triangle whose angles are the divergences $D_{12}$, $D_{13}$, $D_{23}$. Then
we have

$$\frac{Q - Q_0}{Q_0}\,\frac{\pi}{2} = \frac{3\pi}{2} - D_{12} - D_{13} - D_{23} = \frac{\pi}{2} - E.$$

Or:

$$\sin\left(\frac{Q - Q_0}{Q_0}\,\frac{\pi}{2}\right) = \cos E \qquad \text{(lxxxvi.)}.$$

Now take the case of four variables. Here we have





$$\beta_1 = \beta_2 = \beta_3 = \beta_4 = \sqrt{\pi/2},$$

$$v_2' = v_2'' = v_2''' = v_2^{iv} = -1,$$

$$v_4' = v_4'' = v_4''' = v_4^{iv} = 3,$$

and all the odd $v$'s zero.


Hence 
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'% 



" Q-^ = 7 ('"12 + ''13 + ^4 + ''23 + ^24 + ^-B*) + [~) ('''u^n + ^12^34 + ^B^m) 



+ ^- 1! ('"12^ + '"13^ + ''14^ + ''23^ + r^i^ + roj) + ( ^ ) (^^2^13^1,1 + l^r^s^i 



2\2 

TT 



24 



/ 



/ 2 \2 1 

+ '"ib'-SS^ + ^']4r24r34) + ( ~ ) [3" ('"14^^23 + ''l4^23^ + ^13^^24^ + n3^^24 + ''la^^g^ 



+ ^^12^34^) + 



2\2 1 



TT 



2 (^'^12^^^4^''23 + ^13'^14^^23 + ^\^^1^'h + ^13^^/^^S4 + ^^13^^23^^'24 



2^. 



2, 



2^, 



„ 2 



2., 



+ ^14^^23^'24 + ^12^13^4 +^12^14'^34 + ^12^23^'3I + ^14^^23^' 34 + ^''l2^^24'^^34 



"i" '^13'^"24^34 ) + 



S 9 « « • 



■ 9 e • • • IxA^^ Vli, jt 



This is the correct value including terms of the fourth order, but to this order of
approximation we can throw it into a much simpler form. Let $r_{ss'} = \sin\delta_{ss'}$, then



Q — Q o 'TT 

"" Qo ^^ 2 



sin~^ 7^13 + sin ^ r^g + sin'"^ Tj^ + sin ^ Vo^, + sin"™^ ^^^^^ + sin~^ r 



34 



+ — (sin""^ r;|^3 sin""^ r^g sin~^ r^^^. + sin""^ r^^ sin"^ ?^23 ^in""-^ -?' 
+ sin"^ Tig sin"^ rgg sin~^ rg^ + sin""^ r^^^ sin'"^ r^^ sin""^ ^'•g^) 



24 



2 



r..^^) (1 ^ r,/) (1 



12 



+ -^- [sin-'iri4,sin-ir23{(l - 

+ sin-^ ri^sin-i ^3^(1 - 7\^^) (1 - r^.^) (1 - 
+ sin"i ^13 sin~i ^'^^(l - r^/) (1 "™ r^/) (1 » 

8] 2 + §13 + §14, + §23 + §24 + 834 

2 

+ ^^j: (§12813814 + 812823824 + 813823834 + 814834834) 



^^24^) (1 - 

" ^^23^) (1 



-^34^)]"^' 



+ 



2 /S14S23 COS Si4 cos §23 + 813834. cos §13 COS 834 f 813804 COS B^^ COS 8g^' 



TT 



COS §13 COS 813 COS 8i4 COS 833 COS 804 COS S34 



We may write this 
where 

E' = -TT — 813 — Si3 



sm -- - -- =: cos J^^ 



« s « 



V « * V 



(Ixxxviii.). 
(Ixxxix.) 



■"^0 



TT 



^14 ■"" ^23 



3 — 80,1 



34 



2 



TT 



(812813814 + 812833834 + S13833834 + 814834S34) 



2 /814803 cos 8|4 cos 833 4- 8^3834 cos 813 cos 834 + S13834 cos 813 cos 834' 



TT 



COS 813 cos §13 COS 814 cos S33 COS 834 cos 834 



The expressions E and E′ of (lxxxvi.) and (lxxxix.) are of considerable interest, for
they enable us to express the area of a spherical triangle in three-dimensioned space,
and (up to the above degree of approximation) the volume of a "tetrahedron" on a
"sphere" in hyperspace of four dimensions. In fact, the whole theory of hyperspace
"spherical trigonometry" needs investigation in relation to the properties of multiple
correlation.

In our illustrations (viii.) and (ix.) will be found examples of the above formulae
applied to important cases in triple and quadruple correlation in the theory
of heredity. I consider that the formulae above given will cover numerous novel
applications, for many of which greater simplicity will be introduced owing to the
choice of special values for the $h$'s or for the correlation coefficients.



§ 8. Illustrations of the New Methods.

Illustration I. Inheritance of Coat-colour in Horses. The following represents
the distribution of sires and fillies in 1050 cases of thoroughbred racehorses, the
grouping being made into all coat-colour classed as "bay and darker," "chesnut and
lighter":

                                         Sires.
Colour (Fillies).         Bay and darker.   Chesnut and lighter.   Totals.
Bay and darker                 631                  125              756
Chesnut and lighter            147                  147              294
Totals                         778                  272             1050

or, in symbols,

                                a                    b              a + b
                                c                    d              c + d
                              a + c                b + d              N



Then we require the correlation between sire and filly in the matter of coat-colour
and also the probable error of its determination.
We have from (iv.) and (v.)

$$\alpha_1 = \frac{(a + c) - (b + d)}{N} = \sqrt{\frac{2}{\pi}}\int_0^h e^{-\frac{1}{2}x^2}\,dx = .481,905,$$

$$\alpha_2 = \frac{(a + b) - (c + d)}{N} = \sqrt{\frac{2}{\pi}}\int_0^k e^{-\frac{1}{2}y^2}\,dy = .440,000.$$

Hence from the probability integral tables

$$h = .64630, \qquad k = .58284.$$

We have then: $\log HK = \bar{1}.037,3514$ by (xvii.),

Further:

$$\frac{ad - bc}{N^2HK} = .619,068 \text{ from (xxi.)}.$$
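
These preliminary quantities can be recomputed from the fourfold table. The Python sketch below is an added illustration: the function name is hypothetical, and the standard library's `statistics.NormalDist().inv_cdf` stands in for the probability integral tables, so the last figures need not agree with the tabular values.

```python
import math
from statistics import NormalDist


def tetrachoric_inputs(a, b, c, d):
    """Return h, k, HK and (ad - bc)/(N^2 HK) for the fourfold table a, b / c, d."""
    n = a + b + c + d
    alpha1 = ((a + c) - (b + d)) / n          # column division, giving h
    alpha2 = ((a + b) - (c + d)) / n          # row division, giving k
    h = NormalDist().inv_cdf((1.0 + alpha1) / 2.0)
    k = NormalDist().inv_cdf((1.0 + alpha2) / 2.0)
    hk = math.exp(-0.5 * (h * h + k * k)) / (2.0 * math.pi)
    return h, k, hk, (a * d - b * c) / (n * n * hk)
```

Applied to the table of sires and fillies, `tetrachoric_inputs(631, 125, 147, 147)` gives values of $h$, $k$ and the left-hand side of the series close to those stated above.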



Calculating out the coefficients of the series in $r$ in (xix.) we find

$$.619,068 = r + .188,345r^2 + .064,0814r^3 + .107,8220r^4 + .005,9986r^5 + .067,2682r^6.$$

Neglecting powers of $r$ above the second, we find by solving the quadratic and
taking the positive root

$$r = .5600.$$

Solving by two approximations the sextic we finally determine

$$r = .5422,$$

correct, I think, to four places of figures.
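
The solution of the sextic may be sketched with Newton's method in Python. This is an added illustration and not necessarily the process of approximation used above: the names are hypothetical, the coefficients are those of the series as read from the printed equation, and the starting value and tolerance are arbitrary choices.

```python
# Coefficients of r, r^2, ..., r^6 in the series for Illustration I, as read above.
HORSE_COEFFICIENTS = [1.0, 0.188345, 0.0640814, 0.1078220, 0.0059986, 0.0672682]


def solve_series(target, coefficients, start, tolerance=1e-12, max_steps=50):
    """Solve sum_i coefficients[i] * r**(i + 1) = target for r by Newton's method."""
    r = start
    for _ in range(max_steps):
        value = sum(c * r ** (i + 1) for i, c in enumerate(coefficients)) - target
        slope = sum((i + 1) * c * r ** i for i, c in enumerate(coefficients))
        step = value / slope
        r -= step
        if abs(step) < tolerance:
            break
    return r
```

Passing only the first two coefficients gives the root of the quadratic; passing all six gives the root of the sextic.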

Turning now to the probable error as given by Equation (l.), I find

$$h^2 + k^2 - 2rhk = .348,924,$$

and from (xlix.)

$$\log\chi_0 = \bar{1}.170,0947.$$



JL Ltx uHt5i • 



k — rh 



Vi 



9 



•275,642 , 



h — rk 



yi 



•393,078. 



Hence from (xlvii.) and (xlviii.) we find 



^1 



] rdm.on 



v 27rJ Q 



e """ 






^3 



1 f -275,642 



y^27r. 



e 



■IMz 



and by means of the probability integral table 

t/zi = -108,884, t//3=: 152,865, 

By substituting in (l.), we find

probable error of $r$ = .0288.

From (xxxiv.) and (xxxv.) we find

p.e. of $h$ = .0282, p.e. of $k$ = .0278.

Thus, finally, we may sum up our results

$$h = .6463 \pm .0282, \qquad k = .5828 \pm .0278,$$

$$r = .5422 \pm .0288.$$

The probable error of this $r$, if we had been able to find it from the product
moment, would have been .0147, or only about one-half its present value.

Illustration II. Our analysis opens a large field suggested by the following
problem: What is the chance that an exceptional man is born of an exceptional
father?

Of course much depends on how we define "exceptional," and any numerical
measure of it must be quite arbitrary. As an illustration, let us take a man who
possesses a character only possessed by one man in twenty as exceptional. For
example, only one man in twenty is more than 6 feet 1.2 inches in height, and such a
stature may be considered "exceptional." In a class of twenty students we generally
find one of "exceptional" ability, and so on. Accordingly we have classed fathers and
sons who possess characters only possessed by one man in twenty as exceptional. We
first determine $h$ and $k$, so that the tail of the frequency curve cut off is $\frac{1}{20}$ of its
whole area. This gives us $h = k = 1.64485$.

Next we determine $HK = \frac{1}{2\pi}e^{-\frac{1}{2}(h^2 + k^2)}$ and find $\log HK = \bar{2}.026,8228$.

Then we calculate the coefficients of the various powers of $r$ in (xix.). We find

$$\log\tfrac{1}{2}hk = .131,2225.$$
$$\log\tfrac{1}{6}(h^2 - 1)(k^2 - 1) = \bar{1}.685,5683.$$
$$\log\tfrac{1}{24}hk(h^2 - 3)(k^2 - 3) = \bar{3}.990,1176.$$
$$\log\tfrac{1}{120}(h^4 - 6h^2 + 3)(k^4 - 6k^2 + 3) = \bar{1}.464,4772.$$
$$\log\tfrac{1}{720}hk(h^4 - 10h^2 + 15)(k^4 - 10k^2 + 15) = \bar{2}.925,6367.$$

It remains to determine what value we shall give to $r$, the paternal correlation. It
ranges from .3 to .5 for my own measurements as we turn from blended to exclusive
inheritance. Taking these two extreme values we find

$$\frac{ad - bc}{N^2} = .0046344 \text{ or } .0096779.$$

But $\frac{ad - bc}{N^2} = \frac{d}{N} - \frac{(d + b)(d + c)}{N^2}$, and the second term is the chance of exceptional
fathers with exceptional sons, when variation is independent, i.e., when there is no
heredity, $= \frac{1}{20} \times \frac{1}{20} = .0025$.

Thus $d/N$ = .007134 or .012178;

accordingly $b/N$ = .042866 or .037822.
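
The arithmetic of this illustration can be retraced in Python. The sketch below is an added illustration resting on an assumption: that the series (xix.) has the $n$th coefficient $v_{n-1}(h)\,v_{n-1}(k)/n!$, which reproduces the five coefficients listed above. The function names are hypothetical, and the series is stopped at the sixth power of $r$, as in the text.

```python
import math


def v(p, x):
    """p-th function v_p(x) from v_0 = 1, v_1 = x and v_{m+1} = x*v_m - m*v_{m-1}."""
    previous, current = 1.0, x
    if p == 0:
        return previous
    for m in range(1, p):
        previous, current = current, x * current - m * previous
    return current


def excess_proportion(h, k, r, terms=6):
    """(ad - bc)/N^2 = HK * sum_{n=1..terms} r^n v_{n-1}(h) v_{n-1}(k) / n! (assumed form)."""
    hk = math.exp(-0.5 * (h * h + k * k)) / (2.0 * math.pi)
    total = sum(r ** n * v(n - 1, h) * v(n - 1, k) / math.factorial(n)
                for n in range(1, terms + 1))
    return hk * total


def exceptional_pair_frequency(h, k, r, tail=1.0 / 20.0):
    """d/N: the chance of an exceptional father with an exceptional son."""
    return excess_proportion(h, k, r) + tail * tail
```

With $h = k = 1.64485$ the two extreme values of $r$ give the proportions quoted above.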

Hence we conclude that of the 5 per cent, of exceptional men *71 per cent, in the 
first case, and 1*22 per cent, in the second case, are born of exceptional fathers, and 
4*29 per cent, in the first case and 3*78 per cent, in the second case of non-exceptional 
fathers. In other words, out of 1000 men of mark we may expect 142 in the first case, 
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244 in the second, to be born of exceptional parents, while 858 in the first and 756 in 
the second are born of undistinguished" fathers. In the former case the odds are about 
6 to 1, in the latter 3 to 1 against a distinguished son having a distinguished father. 
This result confirms what I have elsewhere stated, that we trust to the great mass of 
our population for the bulk of our distinguished men. On the other hand it does not 
invalidate what I have written on the importance of creating good stock, for a good 
stock means a bias largely above that due to an exceptional father alone. 

In addition to this the yo ^f 'the population forming the exceptional fathers pro- 
duce 142 or 244 exceptional sons to compare with the 858 or 756 exceptional sons 
pi-oduced by the ^| of the population who are non-exceptional. That is to say that 
the relative ^YoduGtion is as 142 to 45*2, or 244 to 39*8, i.e,, in the one case as more 
than 3 to 1, in the other case as more than 6 to 1. In other words^ exceptional 
fathers produce exceptional sons at a rate 3 to 6 times as great as non-exceptional 
fathers. It is only because exceptional fathers are themselves so rare that we must 
trust for the bulk of our distinguished men to the non-exceptional class. 

Illustration III. Heredity in Coat-colour of Hounds. To find the correlation
in coat-colour between Basset hounds which are half-brethren, say, offspring of the
same dam.

Here the classification is simply into lemon and white (lw) and lemon, black and
white or tricolour (t).

The following is the table for 4172 cases:

Colour.         t.       lw.     Totals.

t.             1766      842      2608
lw.             842      722      1564

Totals         2608     1564      4172


Proceeding precisely in the same way as in the first illustration we find:

$$\alpha_1 = \alpha_2 = 0.25024, \qquad h = k = 0.318957,$$
$$\log HK = \bar{1}.1576378, \qquad \epsilon = 0.226234.$$


It will be sufficient now to go to $r^4$. We have

$$0.226234 = r + 0.050867\,r^2 + 0.134480\,r^3 + 0.035587\,r^4.$$

The quadratic gives r = 0.2237. Using the Newtonian method of approximating to
the root we find

r = 0.2222.
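The steps of this illustration can be retraced numerically. The sketch below is not the memoir's own computation: it assumes that the coefficients of $r^2$, $r^3$ and $r^4$ are $hk/2$, $(h^2-1)(k^2-1)/6$ and $hk(h^2-3)(k^2-3)/24$, a form inferred from the printed numbers, and that $HK$ is the product of the normal ordinates at $h$ and $k$. The names `tetrachoric_series` and `solve_newton` are illustrative.

```python
import math
from statistics import NormalDist


def tetrachoric_series(a, b, c, d):
    """Return (h, k, epsilon, coefficients) for the fourfold table

        a  b
        c  d

    h divides the columns (a + c against b + d) and k divides the
    rows (a + b against c + d) of a standard normal surface.
    """
    n = a + b + c + d
    std = NormalDist()
    h = std.inv_cdf((a + c) / n)
    k = std.inv_cdf((a + b) / n)
    hk_ordinates = std.pdf(h) * std.pdf(k)  # the product HK
    eps = (a * d - b * c) / (n * n * hk_ordinates)
    coeffs = [
        1.0,
        h * k / 2,
        (h * h - 1) * (k * k - 1) / 6,
        h * (h * h - 3) * k * (k * k - 3) / 24,
    ]
    return h, k, eps, coeffs


def solve_newton(eps, coeffs, r0, steps=20):
    """Newton's method for sum(coeffs[i] * r**(i + 1)) = eps."""
    r = r0
    for _ in range(steps):
        f = sum(c * r ** (i + 1) for i, c in enumerate(coeffs)) - eps
        df = sum((i + 1) * c * r ** i for i, c in enumerate(coeffs))
        r -= f / df
    return r


# Basset hound table: t/t, t/lw, lw/t, lw/lw
h, k, eps, coeffs = tetrachoric_series(1766, 842, 842, 722)
# First approximation from the quadratic c2*r^2 + r - eps = 0
r_quad = (-1 + math.sqrt(1 + 4 * coeffs[1] * eps)) / (2 * coeffs[1])
r = solve_newton(eps, coeffs, r_quad)
print(round(h, 4), round(eps, 4), round(r_quad, 4), round(r, 4))
```

Starting Newton's method from the quadratic root mirrors the order of work in the text; any starting value near the root would serve equally well here.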



Summing up as before, after finding the probable errors, we have

h = k = 0.3190 ± 0.0133,
r = 0.2222 ± 0.0162.



Illustration IV. Inheritance of Eye-colour in Man. To find the correlation in
eye-colour between a maternal grandmother and her granddaughter. Here the
classification is into eyes described as grey or lighter, and eyes described as dark grey
or darker.*





                                  Maternal grandmother.
Tint.                       Grey or lighter.   Dark grey or darker.   Totals.

Granddaughter:
  Grey or lighter . . . .        254                  136               390
  Dark grey or darker . .        156                  193               349

Totals                           410                  329               739


As before, we find

$$\alpha_1 = 0.109607, \qquad \alpha_2 = 0.055480,$$
$$h = 0.138105, \qquad k = 0.069593,$$
$$\log HK = \bar{1}.1966267, \qquad \epsilon = 0.323760.$$





Series for $r$ up to $r^4$:

$$0.323760 = r + 0.004806\,r^2 + 0.162696\,r^3 + 0.000358\,r^4.$$

The quadratic gives r = 0.3233, and the biquadratic

r = 0.3180,

the value of the term in $r^4$ being 0.00000366, so that higher terms may be neglected.
Determining the probable errors as in Illustration I., we sum up:

k = 0.0696 ± 0.0311,
r = 0.3180 ± 0.0361.

* According to Mr. Galton's classification, the first group contains eyes described as light blue, blue,
dark blue, blue-green, grey; and the second eyes described as dark grey, hazel, light brown, brown, dark
brown, very dark brown, black.

Illustration V. Inheritance of Stature. The following data have been found for
the inheritance of stature between father and son from my Family Data cards, 1078
cases:

Mean stature of father . . . . . . .   67.698"
Mean stature of son  . . . . . . . .   68.661"
Standard deviation of father . . . .    2.7048"
Standard deviation of son  . . . . .

Correlation = 0.5198 ± 0.0150.

Now for purposes of comparison of methods the correlation has been determined 
for this material from various groupings of fathers and sons :— 



(A.)
                              Fathers.
Class.              Below 67.5".   Above 67.5".   Totals.

Sons:
  Below 67.5" . .      269.25          95.75        365
  Above 67.5" . .      232.25         480.75        713

Totals . . .           501.5          576.5        1078



(B.)
                              Fathers.
Class.              Below 66.5".   Above 66.5".   Totals.

Sons:
  Below 67.5" . .      211.25         153.75        365
  Above 67.5" . .      152.75         560.25        713

Totals . . .           364            714          1078

(C.)
                              Fathers.
Class.              Below 67.5".   Above 67.5".   Totals.

Sons:
  Below 68.5" . .      356.25         182.25        538.5
  Above 68.5" . .      145.25         394.25        539.5

Totals . . .           501.5          576.5        1078


(D.)
                              Fathers.
Class.              Below 68.5".   Above 68.5".   Totals.

Sons:
  Below 69.5" . .      506            182           688
  Above 69.5" . .      149.5          240.5         390

Totals . . .           655.5          422.5        1078



(E.)
                              Fathers.
Class.              Below 69.5".   Above 69.5".   Totals.

Sons:
  Below 70.5" . .      669            147           816
  Above 70.5" . .      128            134           262

Totals . . .           797            281          1078



(F.)
                              Fathers.
Class.              Below 70.5".   Above 70.5".   Totals.

Sons:
  Below 69.5" . .      641.25          46.75        688
  Above 69.5" . .      271.75         118.25        390

Totals . . .           913            165          1078

Table of Results.

Classification.   Correlation.        Mean of sons (h).      Mean of fathers (k).

A                 0.5939 ± 0.0247     68.64" (-0.41632)      67.74" (-0.08730)
B                 0.5557 ± 0.0261     68.64" (-0.41632)      67.63" (-0.41886)
C                 0.5529 ± 0.0247     68.50" (-0.00116)      67.74" (-0.08730)
D                 0.5264 ± 0.0264     68.53" (0.35371)       67.77" (0.27430)
E                 0.5213 ± 0.0294     68.60" (0.69657)       67.76" (0.64130)
F                 0.5524 ± 0.0307     68.53" (0.35371)       67.73" (1.02344)



Now these results are of quite peculiar interest. They show us :— 

(i.) That the probable error of r, as found by the present method, increases with
h and k. But the increase is not very rapid, so that the probable errors of the series
range only between 0.025 and 0.031. Hence while it is an advantage, it is not a very
great advantage, to take the divisions of the groups near the medians. It is an
advantage which may be easily counterbalanced by some practical gain in the method
of observation when the division is not close to the medians.

(ii.) While the probable error, as found from the present method of calculation, is
1.5 to 2 times the probable error as found from the product moment, it is by no
means so large as to seriously weigh against the new process, if the old is
unavailable. It is quite true that the results given by the present process for six
arbitrary divisions differ very considerably among themselves. But a consideration
of the probable errors shows that the differences are sensibly larger than the
probable error of the differences, in some cases even double; hence it is not the method
but the assumption of normal correlation for such distributions which is at fault. As
we shall hardly get a better variable than stature to hypothesise normality for, we
see the weakness of the position which assumes without qualification the generality
of the Gaussian law of frequency.

(iii.) We cannot assert that the smaller the probable error the more nearly will
the correlation, as given by the present process, agree with its value as found by
the product moment. If we did we should discard 0.5213, a very accordant result,
in favour of 0.5529, or even 0.5939. The fact is that the higher the correlation the
lower, ceteris paribus, the probable error, and this fact may obscure the really best
result. Judging by the smallness of h and k and of the probable error, we should
be inclined to select C or the value 0.5529. This only differs from 0.5198 by slightly
more than the probable error of the difference (0.033 as compared with 0.029); but
since both are found from the same statistics, and not from different samplings of
the same population, this forms sufficient evidence in itself of want of normality.
The approximate character of all results based on the theory of normal frequency
must be carefully borne in mind; and all we ought to conclude from the present
data for inheritance of stature from father to son would be that the correlation
= 0.55 ± 0.015, while the product moment method would tell us more definitely that
its value was 0.52 ± 0.015. There is no question that the latter method is the better,
but this does not hinder the new method from being extremely serviceable; for
many cases it is the only one available.
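The figures 0.033 and 0.029 quoted in (iii.) can be reproduced as follows. This is only a sketch: it assumes the usual rule that the probable error of a difference is the root of the sum of the squared probable errors, which strictly requires independent samples, a condition the text itself notes is not met here. The function name is illustrative.

```python
import math


def probable_error_of_difference(pe_1, pe_2):
    """Root-sum-of-squares rule, valid for independent determinations."""
    return math.hypot(pe_1, pe_2)


r_fourfold, pe_fourfold = 0.5529, 0.0247  # classification C
r_moment, pe_moment = 0.5198, 0.0150      # product-moment value

difference = r_fourfold - r_moment
pe_difference = probable_error_of_difference(pe_fourfold, pe_moment)
print(round(difference, 3), round(pe_difference, 3))
```

Because both values come from the same 1078 cases, their errors are not independent, which is why a difference of this size already counts as evidence against normality.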

Illustration VI. Effectiveness of Vaccination. To find the correlation between
strength to resist small-pox and the degree of effective vaccination.

We have in the earlier illustrations chosen cases in which in all probability a scale 
of character might possibly, if with difficulty, be determined. In the present case, 
the relationship is a very important one, but a quantitative scale is hardly discover- 
able. Nevertheless, it is of great interest to consider what results flow from the 
application of our method. We may consider our two characters as strength to resist 
the ravages of small-pox and as degree of effective vaccination. No quantitative 
scales are here available ; all the statistics provide are the number of recoveries 
and deaths from small-pox, and the absence or presence of a definite vaccination 
cicatrix. Taking the Metropolitan Asylums Board statistics for the epidemic of 1893, 
we have the table given below, where the cases of " no evidence" have been omitted. 
Proceeding in the usual manner we find

$$\alpha_1 = 0.86929, \qquad \alpha_2 = 0.54157,$$
$$h = 1.51139, \qquad k = 0.74145,$$
$$\epsilon = 0.782454.$$

Hence the equation for $r$ is

$$0.782454 = r + 0.560310\,r^2 - 0.096378\,r^3 + 0.081881\,r^4 - 0.000172\,r^5 - 0.0400597\,r^6,$$

whence r = 0.5954.

Summing up we have, after calculating the probable errors,

h = 1.5114 ± 0.0287,
k = 0.7414 ± 0.0205,
r = 0.5954 ± 0.0272.
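When h and k are far from zero, as here, more terms of the series are needed. The sketch below generates the coefficients to any order, assuming that the coefficient of $r^n$ is $He_{n-1}(h)\,He_{n-1}(k)/n!$ with $He$ the probabilists' Hermite polynomials; this form is inferred from the printed coefficients and is not quoted from the memoir. The function names are illustrative, and bisection is used in place of the Newtonian method for simplicity.

```python
from statistics import NormalDist


def hermite(x, n_max):
    """Values He_0(x) .. He_n_max(x) by He_{n+1} = x*He_n - n*He_{n-1}."""
    values = [1.0, x]
    for n in range(1, n_max):
        values.append(x * values[n] - n * values[n - 1])
    return values[: n_max + 1]


def series_coefficients(h, k, order):
    """Coefficient of r**n for n = 1 .. order."""
    he_h, he_k = hermite(h, order - 1), hermite(k, order - 1)
    coeffs, factorial = [], 1
    for n in range(1, order + 1):
        factorial *= n
        coeffs.append(he_h[n - 1] * he_k[n - 1] / factorial)
    return coeffs


def solve_bisection(eps, coeffs, lo=0.0, hi=1.0, steps=60):
    """Root of the truncated series on [lo, hi], assuming one sign change."""
    def f(r):
        return sum(c * r ** (n + 1) for n, c in enumerate(coeffs)) - eps

    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Vaccination table: present/recovered, present/died, absent/recovered, absent/died
a, b, c, d = 1562, 42, 383, 94
n = a + b + c + d
std = NormalDist()
h = std.inv_cdf((a + c) / n)  # recoveries against deaths
k = std.inv_cdf((a + b) / n)  # cicatrix present against absent
eps = (a * d - b * c) / (n * n * std.pdf(h) * std.pdf(k))
coeffs = series_coefficients(h, k, 6)
r = solve_bisection(eps, coeffs)
print([round(c, 6) for c in coeffs], round(r, 4))
```

The computed h, k and coefficients differ from the printed ones only in the later decimals, as would be expected from the use of modern normal tables, and the root of the six-term series falls close to the printed value of r.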

Strength to resist Small-pox when incurred.

Degree of effective vaccination.

Cicatrix.      Recoveries.    Deaths.    Total.

Present           1562           42       1604
Absent             383           94        477

Total             1945          136       2081

We see accordingly that there is quite a large correlation between recovery and the
presence of the cicatrix. The two things are about as closely related as a child to its
"mid-parent." While the correlation is very substantial and indicates the protective
character of vaccination, even after small-pox is incurred, it is, perhaps, smaller than
some over-ardent supporters of vaccination would have led us to believe.

Illustration VII. Effectiveness of Antitoxin Treatment. To measure quantitatively
the effect of antitoxin in diphtheria cases.

In like manner we may find the correlation between recovery and the administration 
of antitoxin in diphtheria cases. The statistics here are, however, somewhat difficult 
to obtain in a form suited to our purpose. The treatment by antitoxin began in the 
Metropolitan Asylums Board hospitals in 1895, but the serum was then administered 
only in those cases which gave rise to anxiety. Hence we cannot correlate recovery 
and death with the cases treated or not treated in that year, for those who were likely 
to recover were not dosed. In the year 1896 the majority of the cases were, on the 
contrary, treated with antitoxin, and those not treated were the slight cases of very 
small risk; hence, again, we are in great difficulties in drawing up a table.* Further,
if we compare an antitoxin year with a non-antitoxin year, we ought to compare the 
cases treated with antitoxin in the former year with those which would probably have 
been treated with it in the latter year. Lastly, the dosage, nature of cases treated, 
and time of treatment have been modified by the experience gained, so that it seems 
impossible to club a number of years together, and so obtain a satisfactorily wide 
range of statistics. In 1897, practically all the laryngeal cases were treated with
antitoxin. Hence the best we can do is to compare the laryngeal cases in two years,
one before and one after the introduction of antitoxin. The numbers available are
thus rather few, but will help us to form some idea of the correlation. I take the
following data from p. 8 of the Metropolitan Asylums Board 'Report upon the Use of
Antitoxic Serum for 1896':



Laryngeal cases.                 Recoveries.   Deaths.   Totals.

With antitoxin, 1896 . . . .        319          143       462
Without antitoxin, 1894 . . .       177          289       466

Totals                              496          432       928


* When a new drug or process is introduced the medical profession are naturally anxious to give every
patient the possible benefit of it, and patients of course rush to those who first adopt it. But if the real
efficiency of the process or drug is to be measured this is very undesirable. No definite data by which to
measure the effectiveness of the novelty are thus available.

Here I find r = 0.4708 ± 0.0292.

A further table is of interest:

Laryngeal cases.                 Requiring       Not             Totals.
                                 tracheotomy.    requiring it.

Without antitoxin, 1894 . . .       261             205            466
With antitoxin, 1896 . . . .        188             274            462

Totals                              449             479            928

In this case we have r = 0.2385 ± 0.0335.
Lastly, I have drawn up a third table:

Total Infantile Cases, Ages 0-5 years.

                                 Recovery.   Death.   Totals.

With antitoxin, 1896 . . .          912        434      1346
Without antitoxin, 1894 . .         615        556      1171

Totals . . . . .                   1527        990      2517

Here we have* r = 0.2451 ± 0.0205.

The three coefficients are all sensible as compared with their probable errors, and 
that between the administration of antitoxin and recovery in laryngeal cases is 
substantial. But the relationship is by no means so great as in the case of vaccina- 
tion, and if its magnitude justifies the use of antitoxin, even when balanced against 
other ills which may follow in its train, it does not justify the sweeping statements of 
its effectiveness which I have heard made by medical friends. It seems until wider 
statistics are forthcoming a case for cautiously feeling the way forward rather than for 
hasty generalisations. 

* The values of r for all the three cases of this Illustration were determined with great ease from
Equation (xxiv.).

Illustration VIII. Effect on Produce of Superior Stock. To find the effect of
superiority of stock on percentage goodness of produce.

To illustrate this and also the formula (lxxxiii.) for six correlation coefficients, we will
investigate the effect of selecting sire, dam, and one grandsire on the produce when there
is selective pairing of dam and sire. We will suppose grandsire, dam, and sire to be
above the average, and investigate what proportion of the produce will be above the
average. As numbers very like those actually occurring in the case of dogs, horses,
and even men, we may take

Correlation of grandsire and offspring . . . . . = 0.25
Correlation of sire or dam and offspring . . . . = 0.5 in both cases
Correlation of sire and grandsire . . . . . . . = 0.5
Selective mating for sire and dam . . . . . . . = 0.2

We will suppose zero correlation between paternal grandsire and dam, although
with selective mating this may actually exist.* We have then the following
system:

$$r_{14} = 0.25, \quad r_{24} = 0.5, \quad r_{34} = 0.5, \quad r_{23} = 0.2, \quad r_{12} = 0.5, \quad r_{13} = 0.$$

Hence, substituting these values in (lxxxvii.), we find, after some arithmetic:

$$(Q - Q_0)/Q_0 = 1.4851.$$

But $Q_0$ is the chance of produce above the average if there were no heredity
between grandsire, sire, and dam, and no assortative mating. Hence it equals

$$\tfrac{1}{2} \times \tfrac{1}{2} \times \tfrac{1}{2} \times \tfrac{1}{2}\,N = \frac{N}{16}, \qquad \therefore\ Q = 0.1553\,N.$$

Or, of the produce 0.5 N above the average, 0.1553 N instead of 0.0625 N are born of
the superior stock owing to inheritance, &c. In other words, out of the 0.5 N above
the average, 0.1553 N are produced by the stock in sire, dam, and grandsire above the
average, or by 0.1827 of the total stock.† The remaining 0.8173 only produce 0.3447 N,
or the superior stock produces produce above the average at over twice the rate of the
inferior stock. Absolutely, the inferior stock being seven times as numerous produces
about seven-tenths of the superior offspring.
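The arithmetic of this illustration can be retraced in a few lines. The value 1.4851 is taken as printed. The proportion 0.1827 of superior stock is recomputed here from the standard closed form for the chance that three normally correlated variables all exceed their means; whether formula (lxxxv.) has exactly this form is an assumption of the sketch, and the function name is illustrative.

```python
import math


def orthant_probability(r12, r13, r23):
    """Chance that three standardised normal variables all exceed their means."""
    arcs = math.asin(r12) + math.asin(r13) + math.asin(r23)
    return 0.125 + arcs / (4 * math.pi)


excess = 1.4851               # (Q - Q0)/Q0, as printed
q0 = 0.5 ** 4                 # four independent even chances: 1/16
q = q0 * (1 + excess)         # produce above average born of superior stock

# grandsire-sire 0.5, grandsire-dam 0, sire-dam 0.2
superior_stock = orthant_probability(0.5, 0.0, 0.2)

rate_superior = q / superior_stock
rate_inferior = (0.5 - q) / (1 - superior_stock)
print(round(q, 4), round(superior_stock, 4),
      round(rate_superior / rate_inferior, 2))
```

The ratio of the two rates comes out a little above 2, in agreement with "over twice the rate" in the text.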

* A correlation, if there be substantial selective mating, may exist between a man and his mother-in-law.
Its rumoured absence, if established scientifically, would not, however, prove the non-existence of
selective mating, for A may be correlated with B and C, but these not correlated with each other.

† The proportion of pairs of parents associated with a grandsire above the average was found by
putting 0.5, 0.2, and 0 for the three correlation coefficients in (lxxxv.). In comparing with Illustration II.,
the reader must remember we there dealt with an exceptional father, 1 in 20, here only with relatives
above the average, a very much less stringent selection.

Illustration IX. Effect of Exceptional Parentage. Chance of an exceptional
man being born of exceptional parents.

Let us enlarge the example in Illustration II., and seek the proportion of exceptional
men, defined as one in twenty, born of exceptional parents in a community with
assortative mating.

Here we take for father and son $r_{12} = 0.5$, for mother and son $r_{13} = 0.5$, and for
assortative mating, $r_{23} = 0.2$.

We have then to apply the general formulae (lxxxiii.) and (lxxxiv.) for the case of
three variables. We have

$$h_1 = h_2 = h_3 = 1.64485,$$
$$\beta_1 = \beta_2 = \beta_3 = 0.484795,$$
$$v_1 = v_1' = v_1'' = 1.644850,$$
$$v_2 = v_2' = v_2'' = 1.705532,$$
$$v_3 = v_3' = v_3'' = -0.484356,$$
$$v_4 = v_4' = v_4'' = -5.913290.$$

Whence, after some arithmetical reduction, we find

$$(Q - Q_0)/Q_0 = 20.0389.$$

But $Q_0 = \tfrac{1}{20} \times \tfrac{1}{20} \times \tfrac{1}{20}\,N = \tfrac{1}{8000}\,N$. Hence Q = 0.00263 N.

We must now distinguish between the absolute and relative production of exceptional
men by exceptional and non-exceptional parents. The exceptional pairs of
parents are obtained by (xix.), whence we deduce, putting r = 0.2, h = k = 1.64485,

$$\frac{ad - bc}{N^2} = \frac{d}{N} - \frac{(b+d)(c+d)}{N^2} = \frac{d}{N} - \frac{1}{400} = 0.002745.$$

Whence the number of pairs of parents, both exceptional,

= 0.005245 N.

Thus, 0.005245 N pairs of exceptional parents produce 0.00263 N exceptional sons,
and 0.994755 N pairs of parents, non-exceptional in character, produce 0.04737 N
exceptional sons, i.e., the remainder of the 1/20 N. The rates of production are thus as
0.5014 to 0.0476. Or: Pairs of exceptional parents produce exceptional sons at a rate
more than ten times as great as pairs of non-exceptional parents. At the same time,
eighteen times as many exceptional sons are born to non-exceptional as to exceptional
parents, for the latter form only about 1/2 per cent. of the community.
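The figures of this illustration can be checked in two stages. The first recomputes the proportion of exceptional pairs from the series in r, again assuming coefficients of the form $He_{n-1}(h)\,He_{n-1}(k)/n!$ (an inferred form, not quoted from (xix.)); the second takes the printed values 20.0389 and 0.005245 and reproduces the rates. All function and variable names are illustrative.

```python
from statistics import NormalDist


def hermite(x, n_max):
    """Values He_0(x) .. He_n_max(x) by He_{n+1} = x*He_n - n*He_{n-1}."""
    values = [1.0, x]
    for n in range(1, n_max):
        values.append(x * values[n] - n * values[n - 1])
    return values[: n_max + 1]


def both_above(h, k, r, terms=12):
    """Chance that x > h and y > k on a normal surface of correlation r."""
    std = NormalDist()
    he_h, he_k = hermite(h, terms - 1), hermite(k, terms - 1)
    total, factorial = 0.0, 1
    for n in range(1, terms + 1):
        factorial *= n
        total += he_h[n - 1] * he_k[n - 1] * r ** n / factorial
    independent = (1 - std.cdf(h)) * (1 - std.cdf(k))
    return independent + std.pdf(h) * std.pdf(k) * total


# Stage 1: pairs of parents, both exceptional (1 in 20 each), r = 0.2
h = NormalDist().inv_cdf(0.95)
pairs_from_series = both_above(h, h, 0.2)

# Stage 2: rates of production, using the printed figures
excess = 20.0389                 # (Q - Q0)/Q0, as printed
q = (1 / 20) ** 3 * (1 + excess)  # sons of exceptional pairs
pairs = 0.005245                 # exceptional pairs, as printed
q_rest = 1 / 20 - q              # sons of all other pairs
rate_exceptional = q / pairs
rate_rest = q_rest / (1 - pairs)
print(round(pairs_from_series, 6), round(q, 5),
      round(rate_exceptional, 4), round(rate_rest, 4))
```

The two rates stand in a ratio of rather more than ten to one, while `q_rest / q` is about eighteen, which is the contrast between relative and absolute production drawn in the text.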

The reader who will carefully investigate Illustrations II., VIII., and IX. will grasp
fully why so many famous men are born of undistinguished parents, but will, at the
same time, realise the overwhelming advantage of coming of a good stock.
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,, '67449 , ^67449 

:^at^e 14, line 2. ror - -:^3=^:^^ read — . 

J^ Xo V ^ Xo 
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